<120> VECTORS FOR EXPRESSION OF HML-2 POLYPEPTIDES

<130> PP19482.0007 



<140> 10/587,032 

<141> 2006-07-24 

<150> PCT/US03/18666 

<151> 2003-06-13 

<150> 60/388831 

<151> 2002-06-16 

<150> 60/472189 

<151> 2003-05-20 



<160> 83 

<170> PatentIn, version 3.5



<210> 1
<211> 1998
<212> DNA

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 1

atggggcaaa ctaaaagtaa aattaaaagt aaatatgcct cttatctcag ctttattaaa 60

attcttttaa aaagaggggg agttaaagta tctacaaaaa atctaatcaa gctatttcaa 120

ataatagaac aattttgccc atggtttcca gaacaaggaa ctttagatct aaaagattgg 180

aaaagaattg gtaaggaact aaaacaagca ggtaggaagg gtaatatcat tccacttaca 240

gtatggaatg attgggccat tattaaagca gctttagaac catttcaaac agaagaagat 300

agcgtttcag tttctgatgc ccctggaagc tgtataatag attgtaatga aaacacaagg 360

aaaaaatccc agaaagaaac ggaaggttta cattgcgaat atgtagcaga gccggtaatg 420

gctcagtcaa cgcaaaatgt tgactataat caattacagg aggtgatata tcctgaaacg 480

ttaaaattag aaggaaaagg tccagaatta gtggggccat cagagtctaa accacgaggc 540

acaagtcctc ttccagcagg tcaggtgcct gtaacattac aacctcaaaa gcaggttaaa 600

gaaaataaga cccaaccgcc agtagcctat caatactggc ctccggctga acttcagtat 660

cggccacccc cagaaagtca gtatggatat ccaggaatgc ccccagcacc acagggcagg 720

gcgccatacc ctcagccgcc cactaggaga cttaatccta cggcaccacc tagtagacag 780

ggtagtaaat tacatgaaat tattgataaa tcaagaaagg aaggagatac tgaggcatgg 840

caattcccag taacgttaga accgatgcca cctggagaag gagcccaaga gggagagcct 900

cccacagttg aggccagata caagtctttt tcgataaaaa agctaaaaga tatgaaagag 960

ggagtaaaac agtatggacc caactcccct tatatgagga cattattaga ttccattgct 1020

catggacata gactcattcc ttatgattgg gagattctgg caaaatcgtc tctctcaccc 1080

tctcaatttt tacaatttaa gacttggtgg attgatgggg tacaagaaca ggtccgaaga 1140

aatagggctg ccaatcctcc agttaacata gatgcagatc aactattagg aataggtcaa 1200

aattggagta ctattagtca acaagcatta atgcaaaatg aggccattga gcaagttaga 1260

gctatctgcc ttagagcctg ggaaaaaatc caagacccag gaagtacctg cccctcattt 1320

aatacagtaa gacaaggttc aaaagagccc tatcctgatt ttgtggcaag gctccaagat 1380

gttgctcaaa agtcaattgc tgatgaaaaa gcccgtaagg tcatagtgga gttgatggca 1440

tatgaaaacg ccaatcctga gtgtcaatca gccattaagc cattaaaagg aaaggttcct 1500

gcaggatcag atgtaatctc agaatatgta aaagcctgtg atggaatcgg aggagctatg 1560

cataaagcta tgcttatggc tcaagcaata acaggagttg ttttaggagg acaagttaga 1620

acatttggaa gaaaatgtta taattgtggt caaattggtc acttaaaaaa gaattgccca 1680

gtcttaaata aacagaatat aactattcaa gcaactacaa caggtagaga gccacctgac 1740
ttatgtccaa gatgtaaaaa aggaaaacat tgggctagtc aatgtcgttc taaatttgat 1800

aaaaatgggc aaccattgtc gggaaacgag caaaggggcc agcctcaggc cccacaacaa 1860 

actggggcat tcccaattca gccatttgtt cctcagggtt ttcagggaca acaaccccca 1920 

ctgtcccaag tgtttcaggg aataagccag ttaccacaat acaacaattg tcccccgcca 1980 

caagcggcag tgcagcag 1998 

<210> 2 
<211> 2001 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 2 

atggggcaaa ctaaaagtaa aattaaaagt aaatatgcct cttatctcag ctttattaaa 60 

attcttttaa aaagaggggg agttaaagta tctacaaaaa atctaatcaa gctatttcaa 120 

ataatagaac aattttgccc atggtttcca gaacaaggaa ctttagatct aaaagattgg 180 

aaaagaattg gtaaggaact aaaacaagca ggtaggaagg gtaatatcat tccacttaca 240 

gtatggaatg attgggccat tattaaagca gctttagaac catttcaaac agaagaagat 300 

agcgtttcag tttctgatgc ccctggaagc tgtataatag attgtaatga aaacacaagg 360 

aaaaaatccc agaaagaaac ggaaggttta cattgcgaat atgtagcaga gccggtaatg 420 

gctcagtcaa cgcaaaatgt tgactataat caattacagg aggtgatata tcctgaaacg 480 

ttaaaattag aaggaaaagg tccagaatta gtggggccat cagagtctaa accacgaggc 540 

acaagtcctc ttccagcagg tcaggtgcct gtaacattac aacctcaaaa gcaggttaaa 600 

gaaaataaga cccaaccgcc agtagcctat caatactggc ctccggctga acttcagtat 660 

cggccacccc cagaaagtca gtatggatat ccaggaatgc ccccagcacc acagggcagg 720 

gcgccatacc ctcagccgcc cactaggaga cttaatccta cggcaccacc tagtagacag 780 

ggtagtaaat tacatgaaat tattgataaa tcaagaaagg aaggagatac tgaggcatgg 840 

caattcccag taacgttaga accgatgcca cctggagaag gagcccaaga gggagagcct 900 

cccacagttg aggccagata caagtctttt tcgataaaaa agctgaaaga tatgaaagag 960 

ggagtaaaac agtatggacc caactcccct tatatgagga cattattaga ttccattgct 1020 

catggacata gactcattcc ttatgattgg gagattctgg caaaatcgtc tctctcaccc 1080 

tctcaatttt tacaatttaa gacttggtgg attgatgggg tacaagaaca ggtccgaaga 1140 

aatagggctg ccaatcctcc agttaacata gatgcagatc aactattagg aataggtcaa 1200 

aattggagta ctattagtca acaagcatta atgcaaaatg aggccattga gcaagttaga 1260 

gctatctgcc ttagagcctg ggaaaaaatc caagacccag gaagtacctg cccctcattt 1320 

aatacagtaa gacaaggttc aaaagagccc tatcctgatt ttgtggcaag gctccaagat 1380 

gttgctcaaa agtcaattgc tgatgaaaaa gcccgtaagg tcatagtgga gttgatggca 1440 

tatgaaaacg ccaatcctga gtgtcaatca gccattaagc cattaaaagg aaaggttcct 1500 

gcaggatcag atgtaatctc agaatatgta aaagcctgtg atggaatcgg aggagctatg 1560 

tataaagcta tgcttatggc tcaagcaata acaggagttg ttttaggagg acaagttaga 1620 

acatttggaa gaaaatgtta taattgtggt caaattggtc acttaaaaaa gaattgccca 1680 

gtcttaaata aacagaatat aactattcaa gcaactacaa caggtagaga gccacctgac 1740 

ttatgtccaa gatgtaaaaa aggaaaacat tgggctagtc aatgtcgttc taaatttgat 1800 

aaaaatgggc aaccattgtc gggaaacgag caaaggggcc agcctcaggc cccacaacaa 1860 

actggggcat tcccaattca gccatttgtt cctcagggtt ttcagggaca acaaccccca 1920 

ctgtcccaag tgtttcaggg aataagccag ttaccacaat acaacaattg tcccccgcca 1980 

caagcggcag tgcagcagta g 2001 

<210> 3 
<211> 2004 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 3 

atggggcaaa ctaaaagtaa aactaaaagt aaatatgcct cttatctcag ctttattaaa 60 

attcttttaa aaagaggggg agttagagta tctacaaaaa atctaatcaa gctatttcaa 120 

ataatagaac aattttgccc atggtttcca gaacaaggaa ctttagatct aaaagattgg 180 

aaaagaattg gcgaggaact aaaacaagca ggtagaaagg gtaatatcat tccacttaca 240 

gtatggaatg attgggccat tattaaagca gctttagaac catttcaaac aaaagaagat 300 

agcgtttcag tttctgatgc ccctggaagc tgtgtaatag attgtaatga aaagacaggg 360 

agaaaatccc agaaagaaac agaaagttta cattgcgaat atgtaacaga gccagtaatg 420 

gctcagtcaa cgcaaaatgt tgactataat caattacagg gggtgatata tcctgaaacg 480 

ttaaaattag aaggaaaagg tccagaatta gtggggccat cagagtctaa accacgaggg 540 

ccaagtcctc ttccagcagg tcaggtgccc gtaacattac aacctcaaac gcaggttaaa 600

gaaaataaga cccaaccgcc agtagcttat caatactggc cgccggctga acttcagtat 660

ctgccacccc cagaaagtca gtatggatat ccaggaatgc ccccagcact acagggcagg 720 

gcgccatatc ctcagccgcc cactgtgaga cttaatccta cagcatcacg tagtggacaa 780 

ggtggtacac tgcacgcagt cattgatgaa gccagaaaac agggagatct tgaggcatgg 840 

cggttcctgg taattttaca actggtacag gccggggaag agactcaagt aggagcgcct 900 

gcccgagctg agactagatg tgaacctttc accatgaaaa tgttaaaaga tataaaggaa 960 

ggagttaaac aatatggatc caactcccct tatataagaa cattattaga ttccattgct 1020 

catggaaata gacttactcc ttatgactgg gaaagtttgg ccaaatcttc cctttcatcc 1080 

tctcagtatc tacagtttaa aacctggtgg attgatggag tacaagaaca ggtacgaaaa 1140 

aatcaggcta ctaagcccac tgttaatata gacgcagacc aattgttagg aacaggtcca 1200 

aattggagca ccattaacca acaatcagtg atgcagaatg aggctattga acaagtaagg 1260 

gctatttgcc tcagggcctg gggaaaaatt caggacccag gaacagcttt ccctattaat 1320 

tcaattagac aaggctctaa agagccatat cctgactttg tggcaagatt acaagatgct 1380 

gctcaaaagt ctattacaga tgacaatgcc cgaaaagtta ttgtagaatt aatggcctat 1440 

gaaaatgcaa atccagaatg tcagtcggcc ataaagccat taaaaggaaa agttccagca 1500 

ggagttgatg taattacaga atatgtgaag gcttgtgatg ggattggagg agctatgcat 1560 

aaggcaatgc taatggctca agcaatgagg gggctcactc taggaggaca agttagaaca 1620 

tttgggaaaa aatgttataa ttgtggtcaa atcggtcatc tgaaaaggag ttgcccagtc 1680 

ttaaataaac agaatataat aaatcaagct attacagcaa aaaataaaaa gccatctggc 1740 

ctgtgtccaa aatgtggaaa aggaaaacat tgggccaatc aatgtcattc taaatttgat 1800 

aaagatgggc aaccattgtc gggaaacagg aagaggggcc agcctcaggc cccccaacaa 1860 

actggggcat tcccagttca actgtttgtt cctcagggtt ttcaaggaca acaaccccta 1920 

cagaaaatac caccacttca gggagtcagc caattacaac aatccaacag ctgtcccgcg 1980 

ccacagcagg cagcgccaca gtag 2004 

<210> 4 
<211> 852 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 4 

atggggcaaa ctaaaagtaa aattaaaagt aaatatgcct cttatctcag ctttattaaa 60 

attcttttaa aaagaggggg agttaaagta tctacaaaaa atctaatcaa gctatttcaa 120 

ataatagaac aattttgccc atggtttcca gaacaaggaa cttcagatct aaaagattgg 180 

aaaagaattg gtaaggaact aaaacaagca ggtaggaagg gtaatatcat tccacttaca 240 

gtatggaatg attgggccat tattaaagca gctttagaac catttcaaac agaagaagat 300 

agcatttcag tttctgatgc ccctggaagc tgtttaatag attgtaatga aaacacaagg 360 

aaaaaatccc agaaagaaac cgaaagttta cattgcgaat atgtagcaga gccggtaatg 420 

gctcagtcaa cgcaaaatgt tgactataat caattacagg aggtgatata tcctgaaacg 480 

ttaaaattag aaggaaaagg tccagaatta atggggccat cagagtctaa accacgaggc 540 

acaagtcctc ttccagcagg tcaggtgctc gtaagattac aacctcaaaa gcaggttaaa 600 

gaaaataaga cccaaccgca agtagcctat caatactgcc gctggctgaa cttcagtatc 660 

ggccaccccc agaaagtcag tatggatatc caggaatgcc cccagcacca cagggcaggg 720 

cgccatacca tcagccgccc actaggagac ttaatcctat ggcaccacct agtagacagg 780 

gtagtgaatt acatgaaatt attgataaat caagaaagga aggagatact gaggcatggc 840 

aattcccagt aa 852 

<210> 5 
<211> 666 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 5

Met Gly Gln Thr Lys Ser Lys Ile Lys Ser Lys Tyr Ala Ser Tyr Leu
1 5 10 15

Ser Phe Ile Lys Ile Leu Leu Lys Arg Gly Gly Val Lys Val Ser Thr
20 25 30

Lys Asn Leu Ile Lys Leu Phe Gln Ile Ile Glu Gln Phe Cys Pro Trp
35 40 45

Phe Pro Glu Gln Gly Thr Leu Asp Leu Lys Asp Trp Lys Arg Ile Gly
50 55 60

Lys Glu Leu Lys Gln Ala Gly Arg Lys Gly Asn Ile Ile Pro Leu Thr
65 70 75 80

Val Trp Asn Asp Trp Ala Ile Ile Lys Ala Ala Leu Glu Pro Phe Gln
85 90 95

Thr Glu Glu Asp Ser Val Ser Val Ser Asp Ala Pro Gly Ser Cys Ile
100 105 110

Ile Asp Cys Asn Glu Asn Thr Arg Lys Lys Ser Gln Lys Glu Thr Glu
115 120 125

Gly Leu His Cys Glu Tyr Val Ala Glu Pro Val Met Ala Gln Ser Thr
130 135 140

Gln Asn Val Asp Tyr Asn Gln Leu Gln Glu Val Ile Tyr Pro Glu Thr
145 150 155 160

Leu Lys Leu Glu Gly Lys Gly Pro Glu Leu Val Gly Pro Ser Glu Ser
165 170 175

Lys Pro Arg Gly Thr Ser Pro Leu Pro Ala Gly Gln Val Pro Val Thr
180 185 190

Leu Gln Pro Gln Lys Gln Val Lys Glu Asn Lys Thr Gln Pro Pro Val
195 200 205

Ala Tyr Gln Tyr Trp Pro Pro Ala Glu Leu Gln Tyr Arg Pro Pro Pro
210 215 220

Glu Ser Gln Tyr Gly Tyr Pro Gly Met Pro Pro Ala Pro Gln Gly Arg
225 230 235 240

Ala Pro Tyr Pro Gln Pro Pro Thr Arg Arg Leu Asn Pro Thr Ala Pro
245 250 255

Pro Ser Arg Gln Gly Ser Lys Leu His Glu Ile Ile Asp Lys Ser Arg
260 265 270

Lys Glu Gly Asp Thr Glu Ala Trp Gln Phe Pro Val Thr Leu Glu Pro
275 280 285

Met Pro Pro Gly Glu Gly Ala Gln Glu Gly Glu Pro Pro Thr Val Glu
290 295 300

Ala Arg Tyr Lys Ser Phe Ser Ile Lys Lys Leu Lys Asp Met Lys Glu
305 310 315 320

Gly Val Lys Gln Tyr Gly Pro Asn Ser Pro Tyr Met Arg Thr Leu Leu
325 330 335

Asp Ser Ile Ala His Gly His Arg Leu Ile Pro Tyr Asp Trp Glu Ile
340 345 350

Leu Ala Lys Ser Ser Leu Ser Pro Ser Gln Phe Leu Gln Phe Lys Thr
355 360 365

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Arg Asn Arg Ala Ala
370 375 380

Asn Pro Pro Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Ile Gly Gln
385 390 395 400

Asn Trp Ser Thr Ile Ser Gln Gln Ala Leu Met Gln Asn Glu Ala Ile
405 410 415

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Glu Lys Ile Gln Asp
420 425 430

Pro Gly Ser Thr Cys Pro Ser Phe Asn Thr Val Arg Gln Gly Ser Lys
435 440 445

Glu Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Val Ala Gln Lys
450 455 460

Ser Ile Ala Asp Glu Lys Ala Arg Lys Val Ile Val Glu Leu Met Ala
465 470 475 480

Tyr Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys
485 490 495

Gly Lys Val Pro Ala Gly Ser Asp Val Ile Ser Glu Tyr Val Lys Ala
500 505 510

Cys Asp Gly Ile Gly Gly Ala Met Tyr Lys Ala Met Leu Met Ala Gln
515 520 525

Ala Ile Thr Gly Val Val Leu Gly Gly Gln Val Arg Thr Phe Gly Arg
530 535 540

Lys Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Lys Asn Cys Pro
545 550 555 560

Val Leu Asn Lys Gln Asn Ile Thr Ile Gln Ala Thr Thr Thr Gly Arg
565 570 575

Glu Pro Pro Asp Leu Cys Pro Arg Cys Lys Lys Gly Lys His Trp Ala
580 585 590

Ser Gln Cys Arg Ser Lys Phe Asp Lys Asn Gly Gln Pro Leu Ser Gly
595 600 605

Asn Glu Gln Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe
610 615 620

Pro Ile Gln Pro Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Pro
625 630 635 640

Leu Ser Gln Val Phe Gln Gly Ile Ser Gln Leu Pro Gln Tyr Asn Asn
645 650 655

Cys Pro Pro Pro Gln Ala Ala Val Gln Gln
660 665

<210> 6 
<211> 667 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 6 

Met Gly Gln Thr Lys Ser Lys Thr Lys Ser Lys Tyr Ala Ser Tyr Leu
1 5 10 15

Ser Phe Ile Lys Ile Leu Leu Lys Arg Gly Gly Val Arg Val Ser Thr
20 25 30

Lys Asn Leu Ile Lys Leu Phe Gln Ile Ile Glu Gln Phe Cys Pro Trp
35 40 45

Phe Pro Glu Gln Gly Thr Leu Asp Leu Lys Asp Trp Lys Arg Ile Gly
50 55 60

Glu Glu Leu Lys Gln Ala Gly Arg Lys Gly Asn Ile Ile Pro Leu Thr
65 70 75 80

Val Trp Asn Asp Trp Ala Ile Ile Lys Ala Ala Leu Glu Pro Phe Gln
85 90 95

Thr Lys Glu Asp Ser Val Ser Val Ser Asp Ala Pro Gly Ser Cys Val
100 105 110

Ile Asp Cys Asn Glu Lys Thr Gly Arg Lys Ser Gln Lys Glu Thr Glu
115 120 125

Ser Leu His Cys Glu Tyr Val Thr Glu Pro Val Met Ala Gln Ser Thr
130 135 140

Gln Asn Val Asp Tyr Asn Gln Leu Gln Gly Val Ile Tyr Pro Glu Thr
145 150 155 160

Leu Lys Leu Glu Gly Lys Gly Pro Glu Leu Val Gly Pro Ser Glu Ser
165 170 175

Lys Pro Arg Gly Pro Ser Pro Leu Pro Ala Gly Gln Val Pro Val Thr
180 185 190

Leu Gln Pro Gln Thr Gln Val Lys Glu Asn Lys Thr Gln Pro Pro Val
195 200 205

Ala Tyr Gln Tyr Trp Pro Pro Ala Glu Leu Gln Tyr Leu Pro Pro Pro
210 215 220

Glu Ser Gln Tyr Gly Tyr Pro Gly Met Pro Pro Ala Leu Gln Gly Arg
225 230 235 240

Ala Pro Tyr Pro Gln Pro Pro Thr Val Arg Leu Asn Pro Thr Ala Ser
245 250 255

Arg Ser Gly Gln Gly Gly Thr Leu His Ala Val Ile Asp Glu Ala Arg
260 265 270

Lys Gln Gly Asp Leu Glu Ala Trp Arg Phe Leu Val Ile Leu Gln Leu
275 280 285

Val Gln Ala Gly Glu Glu Thr Gln Val Gly Ala Pro Ala Arg Ala Glu
290 295 300

Thr Arg Cys Glu Pro Phe Thr Met Lys Met Leu Lys Asp Ile Lys Glu
305 310 315 320

Gly Val Lys Gln Tyr Gly Ser Asn Ser Pro Tyr Ile Arg Thr Leu Leu
325 330 335

Asp Ser Ile Ala His Gly Asn Arg Leu Thr Pro Tyr Asp Trp Glu Ser
340 345 350

Leu Ala Lys Ser Ser Leu Ser Ser Ser Gln Tyr Leu Gln Phe Lys Thr
355 360 365

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Lys Asn Gln Ala Thr
370 375 380

Lys Pro Thr Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Thr Gly Pro
385 390 395 400

Asn Trp Ser Thr Ile Asn Gln Gln Ser Val Met Gln Asn Glu Ala Ile
405 410 415

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Gly Lys Ile Gln Asp
420 425 430

Pro Gly Thr Ala Phe Pro Ile Asn Ser Ile Arg Gln Gly Ser Lys Glu
435 440 445

Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Ala Ala Gln Lys Ser
450 455 460

Ile Thr Asp Asp Asn Ala Arg Lys Val Ile Val Glu Leu Met Ala Tyr
465 470 475 480

Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys Gly
485 490 495

Lys Val Pro Ala Gly Val Asp Val Ile Thr Glu Tyr Val Lys Ala Cys
500 505 510

Asp Gly Ile Gly Gly Ala Met His Lys Ala Met Leu Met Ala Gln Ala
515 520 525

Met Arg Gly Leu Thr Leu Gly Gly Gln Val Arg Thr Phe Gly Lys Lys
530 535 540

Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Arg Ser Cys Pro Val
545 550 555 560

Leu Asn Lys Gln Asn Ile Ile Asn Gln Ala Ile Thr Ala Lys Asn Lys
565 570 575

Lys Pro Ser Gly Leu Cys Pro Lys Cys Gly Lys Gly Lys His Trp Ala
580 585 590

Asn Gln Cys His Ser Lys Phe Asp Lys Asp Gly Gln Pro Leu Ser Gly
595 600 605

Asn Arg Lys Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe
610 615 620

Pro Val Gln Leu Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Leu
625 630 635 640

Gln Lys Ile Pro Pro Leu Gln Gly Val Ser Gln Leu Gln Gln Ser Asn
645 650 655

Ser Cys Pro Ala Pro Gln Gln Ala Ala Pro Gln
660 665

<210> 7 
<211> 283 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 7

Met Gly Gln Thr Lys Ser Lys Ile Lys Ser Lys Tyr Ala Ser Tyr Leu
1 5 10 15

Ser Phe Ile Lys Ile Leu Leu Lys Arg Gly Gly Val Lys Val Ser Thr
20 25 30

Lys Asn Leu Ile Lys Leu Phe Gln Ile Ile Glu Gln Phe Cys Pro Trp
35 40 45

Phe Pro Glu Gln Gly Thr Ser Asp Leu Lys Asp Trp Lys Arg Ile Gly
50 55 60

Lys Glu Leu Lys Gln Ala Gly Arg Lys Gly Asn Ile Ile Pro Leu Thr
65 70 75 80

Val Trp Asn Asp Trp Ala Ile Ile Lys Ala Ala Leu Glu Pro Phe Gln
85 90 95

Thr Glu Glu Asp Ser Ile Ser Val Ser Asp Ala Pro Gly Ser Cys Leu
100 105 110

Ile Asp Cys Asn Glu Asn Thr Arg Lys Lys Ser Gln Lys Glu Thr Glu
115 120 125

Ser Leu His Cys Glu Tyr Val Ala Glu Pro Val Met Ala Gln Ser Thr
130 135 140

Gln Asn Val Asp Tyr Asn Gln Leu Gln Glu Val Ile Tyr Pro Glu Thr
145 150 155 160

Leu Lys Leu Glu Gly Lys Gly Pro Glu Leu Met Gly Pro Ser Glu Ser
165 170 175

Lys Pro Arg Gly Thr Ser Pro Leu Pro Ala Gly Gln Val Leu Val Arg
180 185 190

Leu Gln Pro Gln Lys Gln Val Lys Glu Asn Lys Thr Gln Pro Gln Val
195 200 205

Ala Tyr Gln Tyr Cys Arg Trp Leu Asn Phe Ser Ile Gly His Pro Gln
210 215 220

Lys Val Ser Met Asp Ile Gln Glu Cys Pro Gln His His Arg Ala Gly
225 230 235 240

Arg His Thr Ile Ser Arg Pro Leu Gly Asp Leu Ile Leu Trp His His
245 250 255

Leu Val Asp Arg Val Val Asn Tyr Met Lys Leu Leu Ile Asn Gln Glu
260 265 270

Arg Lys Glu Ile Leu Arg His Gly Asn Ser Gln
275 280

<210> 8 
<211> 434 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 8

Met Pro Pro Ala Pro Gln Gly Arg Ala Pro Tyr His Gln Pro Pro Thr
1 5 10 15

Arg Arg Leu Asn Pro Met Ala Pro Pro Ser Arg Gln Gly Ser Glu Leu
20 25 30

His Glu Ile Ile Asp Lys Ser Arg Lys Glu Gly Asp Thr Glu Ala Trp
35 40 45

Gln Phe Pro Val Thr Leu Glu Pro Met Pro Pro Gly Glu Gly Ala Gln
50 55 60

Glu Gly Glu Pro Pro Thr Val Glu Ala Arg Tyr Lys Ser Phe Ser Ile
65 70 75 80

Lys Met Leu Lys Asp Met Lys Glu Gly Val Lys Gln Tyr Gly Pro Asn
85 90 95

Ser Pro Tyr Met Arg Thr Leu Leu Asp Ser Ile Ala Tyr Gly His Arg
100 105 110

Leu Ile Pro Tyr Asp Trp Glu Ile Leu Ala Lys Ser Ser Leu Ser Pro
115 120 125

Ser Gln Phe Leu Gln Phe Lys Thr Trp Trp Ile Asp Gly Val Gln Glu
130 135 140

Gln Val Arg Arg Asn Arg Ala Ala Asn Pro Pro Val Asn Ile Asp Ala
145 150 155 160

Asp Gln Leu Leu Gly Ile Gly Gln Asn Trp Ser Thr Ile Ser Gln Gln
165 170 175

Ala Leu Met Gln Asn Glu Ala Ile Glu Gln Val Arg Ala Ile Cys Leu
180 185 190

Arg Ala Trp Glu Lys Ile Gln Asp Pro Gly Ser Thr Cys Pro Ser Phe
195 200 205

Asn Thr Val Arg Gln Gly Ser Lys Glu Pro Tyr Pro Asp Phe Val Ala
210 215 220

Arg Leu Gln Asp Val Ala Gln Lys Ser Ile Ala Asp Glu Lys Ala Gly
225 230 235 240

Lys Val Ile Val Glu Leu Met Ala Tyr Glu Asn Ala Asn Pro Glu Cys
245 250 255

Gln Ser Ala Ile Lys Pro Leu Lys Gly Lys Val Pro Ala Gly Ser Asp
260 265 270

Val Ile Ser Glu Tyr Val Lys Ala Cys Asp Gly Ile Gly Gly Ala Met
275 280 285

His Lys Ala Met Leu Met Ala Gln Ala Ile Thr Gly Val Val Leu Gly
290 295 300

Gly Gln Val Arg Thr Phe Gly Gly Lys Cys Tyr Asn Cys Gly Gln Ile
305 310 315 320

Gly His Leu Lys Lys Asn Cys Pro Val Leu Asn Lys Gln Asn Ile Thr
325 330 335

Ile Gln Ala Thr Thr Thr Gly Arg Glu Pro Pro Asp Leu Cys Pro Arg
340 345 350

Cys Lys Lys Gly Lys His Trp Ala Ser Gln Cys Arg Ser Lys Phe Asp
355 360 365

Lys Asn Gly Gln Pro Leu Ser Gly Asn Glu Gln Arg Gly Gln Pro Gln
370 375 380

Ala Pro Gln Gln Thr Gly Ala Phe Pro Ile Gln Pro Phe Val Pro Gln
385 390 395 400

Gly Phe Gln Gly Gln Gln Pro Pro Leu Ser Gln Val Phe Gln Gly Ile
405 410 415

Ser Gln Leu Pro Gln Tyr Asn Asn Cys Pro Ser Pro Gln Ala Ala Val
420 425 430

Gln Gln

<210> 9 
<211> 666 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 9 

Met Gly Gln Thr Lys Ser Lys Ile Lys Ser Lys Tyr Ala Ser Tyr Leu
1 5 10 15

Ser Phe Ile Lys Ile Leu Leu Lys Arg Gly Gly Val Lys Val Ser Thr
20 25 30

Lys Asn Leu Ile Lys Leu Phe Gln Ile Ile Glu Gln Phe Cys Pro Trp
35 40 45

Phe Pro Glu Gln Gly Thr Leu Asp Leu Lys Asp Trp Lys Arg Ile Gly
50 55 60

Lys Glu Leu Lys Gln Ala Gly Arg Lys Gly Asn Ile Ile Pro Leu Thr
65 70 75 80

Val Trp Asn Asp Trp Ala Ile Ile Lys Ala Ala Leu Glu Pro Phe Gln
85 90 95

Thr Glu Glu Asp Ser Val Ser Val Ser Asp Ala Pro Gly Ser Cys Ile
100 105 110

Ile Asp Cys Asn Glu Asn Thr Gly Lys Lys Ser Gln Lys Glu Thr Glu
115 120 125

Gly Leu His Cys Glu Tyr Val Ala Glu Pro Val Met Ala Gln Ser Thr
130 135 140

Gln Asn Val Asp Tyr Asn Gln Leu Gln Glu Val Ile Tyr Pro Glu Thr
145 150 155 160

Leu Lys Leu Glu Gly Lys Gly Pro Glu Leu Val Gly Pro Ser Glu Ser
165 170 175

Lys Pro Arg Gly Thr Ser Pro Leu Pro Ala Gly Gln Val Pro Val Thr
180 185 190

Leu Gln Pro Gln Lys Gln Val Lys Glu Asn Lys Thr Gln Pro Pro Val
195 200 205

Ala Tyr Gln Tyr Trp Pro Pro Ala Glu Leu Gln Tyr Arg Pro Pro Pro
210 215 220

Glu Ser Gln Tyr Gly Tyr Pro Gly Met Pro Pro Ala Pro Gln Gly Arg
225 230 235 240

Ala Pro Tyr Pro Gln Pro Pro Thr Arg Arg Leu Asn Pro Thr Ala Pro
245 250 255

Pro Ser Arg Gln Gly Ser Lys Leu His Glu Ile Ile Asp Lys Ser Arg
260 265 270

Lys Glu Gly Asp Thr Glu Ala Trp Gln Phe Pro Val Thr Leu Glu Pro
275 280 285

Met Pro Pro Gly Glu Gly Ala Gln Glu Gly Glu Pro Pro Thr Val Glu
290 295 300

Ala Arg Tyr Lys Ser Phe Ser Ile Lys Lys Leu Lys Asp Met Lys Glu
305 310 315 320

Gly Val Lys Gln Tyr Gly Pro Asn Ser Pro Tyr Met Arg Thr Leu Leu
325 330 335

Asp Ser Ile Ala His Gly His Arg Leu Ile Pro Tyr Asp Trp Glu Ile
340 345 350

Gln Ala Lys Ser Ser Leu Ser Pro Ser Gln Phe Leu Gln Phe Lys Thr
355 360 365

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Arg Asn Arg Ala Ala
370 375 380

Asn Pro Pro Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Ile Gly Gln
385 390 395 400

Asn Trp Ser Thr Ile Ser Gln Gln Ala Leu Met Gln Asn Glu Ala Ile
405 410 415

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Glu Lys Ile Gln Asp
420 425 430

Pro Gly Ser Thr Cys Pro Ser Phe Asn Thr Val Arg Gln Gly Ser Lys
435 440 445

Glu Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Val Ala Gln Lys
450 455 460

Ser Ile Ala Asp Glu Lys Ala Arg Lys Val Ile Val Glu Leu Met Ala
465 470 475 480

Tyr Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys
485 490 495

Gly Lys Val Pro Ala Gly Ser Asp Val Ile Ser Glu Tyr Val Lys Ala
500 505 510

Cys Asp Gly Ile Gly Gly Ala Met His Lys Ala Met Leu Met Ala Gln
515 520 525

Ala Ile Thr Gly Val Val Leu Gly Gly Gln Val Arg Thr Phe Gly Arg
530 535 540

Lys Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Lys Asn Cys Pro
545 550 555 560

Val Leu Asn Lys Gln Asn Ile Thr Ile Gln Ala Thr Thr Thr Gly Arg
565 570 575

Glu Pro Pro Asp Leu Cys Pro Arg Cys Lys Lys Gly Lys His Trp Ala
580 585 590

Ser Gln Cys Arg Ser Lys Phe Asp Lys Asn Gly Gln Pro Leu Ser Gly
595 600 605

Asn Glu Gln Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe
610 615 620

Pro Ile Gln Pro Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Pro
625 630 635 640

Leu Ser Gln Val Phe Gln Gly Ile Ser Gln Leu Pro Gln Tyr Asn Asn
645 650 655

Cys Pro Pro Pro Gln Ala Ala Val Gln Gln
660 665

<210> 10
<211> 1000
<212> DNA

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 10

atgggcaacc attgtcggga aacgagcaaa ggggccagcc tcaggcccca caacaaactg 60

gggcattccc aattcagcca tttgttcctc agggttttca gggacaacaa cccccactgt 120

cccaagtgtt tcagggaata agccagttac cacaatacaa caattgtccc ccgccacaag 180

cggcagtgca gcagtagatt tatgtactat acaagcagtc tctctgcttc caggggagcc 240

cccacaaaaa acccccacag gggtatatgg acccctgcct aaggggactg taggactaat 300

cttgggacga tcaagtctaa atctaaaagg agttcaaatt catactagtg tggttgattc 360

agactataaa ggcgaaattc aattggttat tagctcttca attccttgga gtgccagtcc 420

aagagacagg attgctcaat tattactcct gccatacatt aagggtggaa atagtgaaat 480

aaaaagaata ggagggcttg gaagcactga tccaacagga aaggctgcat attgggcaag 540

tcaggtctca gagaacagac ctgtgtgtaa ggccattatt caaggaaaac agtttgaagg 600

gttggtagac actggagcag atgtctctat cattgcttta aatcagtggc caaaaaattg 660

gcctaaacaa aaggctgtta caggacttgt cggcataggc acagcctcag aagtgtatca 720

aagtacggag attttacatt gcttagggcc agataatcaa gaaagtactg ttcagccaat 780

gattacttca attcctctta atctgtgggg tcgagattta ttacaacaat ggggtgcgga 840

aatcaccatg cccgctccat catatagccc cacgagtcaa aaaatcatga ccaagatggg 900

atatatacca ggaaagggac tagggaaaaa tgaagatggc attaaaattc cagttgaggc 960

taaaataaat caagaaagag aaggaatagg gaatccttgc 1000

<210> 11
<211> 1004
<212> DNA

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 11

atgggcaacc attgtcggga aacaggaaga ggggccagcc tcaggccccc caacaaactg 60

gggcattccc agttcaactg tttgttcctc agggttttca aggacaacaa cccctacaga 120

aaataccacc acttcaggga gtcagccaat tacaacaatc caacagctgt cccgcgccac 180

agcaggcagc gccacagtag atttatgttc cacccaaatg gtctctttac tccctggaga 240

gcccccacaa aagattccta gaggggtata tggcccgctg ccagaaggga gggtaggcct 300

tattttaggg agatcaagtc taaatttgaa gggagtccaa attcatactg gggtaattta 360

ttcagattat aaagggggaa ttcagttagt gatcagctcc actgttccct ggagtgccaa 420

tccaggtgat agaattgctc aattactgct tttgccttat gttaaaattg gggaaaacaa 480

aacggaaaga acaggagggt ttggaagtac caaccctgca ggaaaagcca cttattgggc 540 

taatcaggtc tcagaggata gacccgtgtg tacagtcact attcagggaa agagtttgaa 600 

ggattagtgg atacccaggc tgatgtttct atcatcggca taggcaccgc ctcagaagtg 660 

tatcaaagtg ccatgatttt acattgtcta ggatctgata atcaagaaag tacggttcag 720 

cctatgatca cttctattcc aatcaattta tggggccgag acttgttaca acaatggcat 780 

gcagagatta ctatcccagc ctccctatac agccccagga atcaaaaaat catgactaaa 840 

atgggatagc tccctaaaaa gggactagga aagaatgaag atggcattaa agtcccaact 900 

gaggctgaaa aaaatcaaaa aaagaaaagg aatagggcat cctttttaga agcggtcact 960 

gtagagcctc caaaacccat tccattaatt tggggggaaa aaaa 1004

<210> 12 
<211> 279 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 12 

atggagattt tacattgctt agggccagat aatcaagaaa gtactgttca gccaatgatt 60

acttcaattc ctcttaatct gtggggtcga gatttattac aacaatgggg tgcggaaatc 120

accatgcccg ctccattata tagccccacg agtcaaaaaa tcatgaccaa gatgggatat 180

ataccaggaa agggactagg gaaaaatgaa gatggcatta aagttccagt tgaggctaaa 240

ataaatcaag aaagagaagg aatagggtat cctttttag 279

<210> 13 
<211> 92 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 13 

Met Glu Ile Leu His Cys Leu Gly Pro Asp Asn Gln Glu Ser Thr Val
1 5 10 15

Gln Pro Met Ile Thr Ser Ile Pro Leu Asn Leu Trp Gly Arg Asp Leu
20 25 30

Leu Gln Gln Trp Gly Ala Glu Ile Thr Met Pro Ala Pro Leu Tyr Ser
35 40 45

Pro Thr Ser Gln Lys Ile Met Thr Lys Met Gly Tyr Ile Pro Gly Lys
50 55 60

Gly Leu Gly Lys Asn Glu Asp Gly Ile Lys Val Pro Val Glu Ala Lys
65 70 75 80

Ile Asn Gln Glu Arg Glu Gly Ile Gly Tyr Pro Phe
85 90

<210> 14 
<211> 333 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 14

Trp Ala Thr Ile Val Gly Lys Arg Ala Lys Gly Pro Ala Ser Gly Pro
1 5 10 15

Thr Thr Asn Trp Gly Ile Pro Asn Ser Ala Ile Cys Ser Ser Gly Phe
20 25 30

Ser Gly Thr Thr Thr Pro Thr Val Pro Ser Val Ser Gly Asn Lys Pro
35 40 45

Val Thr Thr Ile Gln Gln Leu Ser Pro Ala Thr Ser Gly Ser Ala Ala
50 55 60

Val Asp Leu Cys Thr Ile Gln Ala Val Ser Leu Leu Pro Gly Glu Pro
65 70 75 80

Pro Gln Lys Thr Pro Thr Gly Val Tyr Gly Pro Leu Pro Lys Gly Thr
85 90 95

Val Gly Leu Ile Leu Gly Arg Ser Ser Leu Asn Leu Lys Gly Val Gln
100 105 110

Ile His Thr Ser Val Val Asp Ser Asp Tyr Lys Gly Glu Ile Gln Leu
115 120 125

Val Ile Ser Ser Ser Ile Pro Trp Ser Ala Ser Pro Arg Asp Arg Ile
130 135 140

Ala Gln Leu Leu Leu Leu Pro Tyr Ile Lys Gly Gly Asn Ser Glu Ile
145 150 155 160

Lys Arg Ile Gly Gly Leu Gly Ser Thr Asp Pro Thr Gly Lys Ala Ala
165 170 175

Tyr Trp Ala Ser Gln Val Ser Glu Asn Arg Pro Val Cys Lys Ala Ile
180 185 190

Ile Gln Gly Lys Gln Phe Glu Gly Leu Val Asp Thr Gly Ala Asp Val
195 200 205

Ser Ile Ile Ala Leu Asn Gln Trp Pro Lys Asn Trp Pro Lys Gln Lys
210 215 220

Ala Val Thr Gly Leu Val Gly Ile Gly Thr Ala Ser Glu Val Tyr Gln
225 230 235 240

Ser Thr Glu Ile Leu His Cys Leu Gly Pro Asp Asn Gln Glu Ser Thr
245 250 255

Val Gln Pro Met Ile Thr Ser Ile Pro Leu Asn Leu Trp Gly Arg Asp
260 265 270

Leu Leu Gln Gln Trp Gly Ala Glu Ile Thr Met Pro Ala Pro Ser Tyr
275 280 285

Ser Pro Thr Ser Gln Lys Ile Met Thr Lys Met Gly Tyr Ile Pro Gly
290 295 300

Lys Gly Leu Gly Lys Asn Glu Asp Gly Ile Lys Ile Pro Val Glu Ala
305 310 315 320

Lys Ile Asn Gln Glu Arg Glu Gly Ile Gly Asn Pro Cys
325 330

<210> 15 
<211> 2896 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 15

atggcattaa aattccagtt gaggctaaaa taaatcaaga aagagaagga atagggaatc 60

cttgctaggg gcggccactg tagagcctcc taaacccata ccattaactt ggaaaacaga 120

aaaaccagtg tgggtaaatc agtggccgct accaaaacaa aaactggagg ctttacattt 180

attagcaaat gaacagttag aaaagggtca tattgagcct tcgttctcac cttggaattc 240

tcctgtgttt gtaattcaga agaaatcagg caaatggcgt atgttaactg acttaagggc 300

tgtaaacgcc gtaattcaac ccatggggcc tctccaaccc gggttgccct ctccggccat 360 

gatcccaaaa gattggcctt taattataat tgatctaaag gattgctttt ttaccatccc 420 

tctggcagag caggattgcg aaaaatttgc ctttactata ccagccataa ataataaaga 480 

accagccacc aggtttcagt ggaaagtgtt acctcaggga atgcttaata gtccaactat 540 

ttgtcagact tttgtaggtc gagctcttca accagttaga gaaaagtttt cagactgtta 600 

tattattcat tgtattgatg atattttatg tgctgcagaa acgaaagata aattaattga 660 

ctgttataca tttctgcaag cagaggttgc caatgctgga ctggcaatag catctgataa 720 

gatccaaacc tctactcctt ttcattattt agggatgcag atagaaaata gaaaaattaa 780 

gccacaaaaa atagaaataa gaaaagacac attaaaaaca ctaaatgatt ttcaaaaatt 840 

actaggagat attaattgga ttcggccaac tctaggcatt cctacttatg ccatgtcaaa 900 

tttgttctct atcttaagag gagactcaga cttaaatagt aaaagaatgt taaccccaga 960 

ggcaacaaaa gaaattaaat tagtggaaga aaaaattcag tcagcgcaaa taaatagaat 1020 

agatccctta gccccactcc aacttttgat ttttgccact gcacattctc caacaggcat 1080 

cattattcaa aatactgatc ttgtggagtg gtcattcctt cctcacagta cagttaagac 1140 

ttttacattg tacttggatc aaatagctac attaatcggt cagacaagat tacgaataat 1200 

aaaattatgt gggaatgacc cagacaaaat agttgtccct ttaaccaagg aacaagttag 1260 

acaagccttt atcaattctg gtgcatggaa gattggtctt gctaattttg tgggaattat 1320 

tgataatcat tacccaaaaa caaagatctt ccagttctta aaattgacta cttggattct 1380 

acctaaaatt accagacgtg aacctttaga aaatgctcta acagtattta ctgatggttc 1440 

cagcaatgga aaagcagctt acacaggacc gaaagaacga gtaatcaaaa ctccatatca 1500 

atcggctcaa agagcagagt tggttgcagt cattacagtg ttacaagatt ttgaccaacc 1560 

tatcaatatt atatcagatt ctgcatatgt agtacaggct acaagggatg ttgagacagc 1620 

tctaattaaa tatagcatgg atgatcagtt aaaccagcta ttcaatttat tacaacaaac 1680 

tgtaagaaaa agaaatttcc cattttatat tacacatatt cgagcacaca ctaatttacc 1740 

agggcctttg actaaagcaa atgaacaagc tgacttactg gtatcatctg cactcataaa 1800 

agcacaagaa cttcatgctt tgactcatgt aaatgcagca ggattaaaaa acaaatttga 1860 

tgtcacatgg aaacaggcaa aagatattgt acaacattgc acccagtgtc aagtcttaca 1920 

cctgcccact caagaggcag gagttaatcc cagaggtctg tgtcctaatg cattatggca 1980 

aatggatgtc acgcatgtac cttcatttgg aagattatca tatgttcacg taacagttga 2040 

tacttattca catttcatat gggcaacttg ccaaacagga gaaagtactt cccatgttaa 2100 

aaaacattta ttgtcttgtt ttgctgtaat gggagttcca gaaaaaatca aaactgacaa 2160 

tggaccagga tattgtagta aagctttcca aaaattctta agtcagtgga aaatttcaca 2220 

tacaacagga attccttata attcccaagg acaggccata gttgaaagaa ctaatagaac 2280 

actcaaaact caattagtta aacaaaaaga agggggagac agtaaggagt gtaccactcc 2340 

tcagatgcaa cttaatctag cactctatac tttaaatttt ttaaacattt atagaaatca 2400 

gactactact tctgcagaac aacatcttac tggtaaaaag aacagcccac atgaaggaaa 2460 

actaatttgg tggaaagata ataaaaataa gacatgggaa atagggaagg tgataacgtg 2520 

ggggagaggt tttgcttgtg tttcaccagg agaaaatcag cttcctgttt ggatacccac 2580 

tagacatttg aagttctaca atgaacccat cagagatgca aagaaaagca cctccgcgga 2640 

gacggagaca tcgcaatcga gcaccgttga ctcacaagat gaacaaaatg gtgacgtcag 2700 

aagaacagat gaagttgcca tccaccaaga aggcagagcc gccaacttgg gcacaactaa 2760 

agaagctgac gcagttagct acaaaatatc tagagaacac aaaggtgaca caaaccccag 2820 

agagtatgct gcttgcagcc ttgatgattg tatcaatggt ggtaagtctc cctatgcctg 2880 

caggagcagc tgcagc 2896 

<210> 16 
<211> 2619 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 16 

atgttaactg acttaagggc tgtaaacgcc gtaattcaac ccatggggcc tctccaaccc 60 

gggttgccct ctccggccat gatcccaaaa gattggcctt taattataat tgatctaaag 120 

gattgctttt ttaccatccc tctggcagag caggattgcg aaaaatttgc ctttactata 180 

ccagccataa ataataaaga accagccacc aggtttcagt ggaaagtgtt acctcaggga 240 

atgcttaata gtccaactat ttgtcagact tttgtaggtc gagctcttca accagttaga 300 

gaaaagtttt cagactgtta tattattcat tgtattgatg atattttatg tgctgcagaa 360 

acgaaagata aattaattga ctgttataca tttctgcaag cagaggttgc caatgctgga 420 

ctggcaatag catctgataa gatccaaacc tctactcctt ttcattattt agggatgcag 480 

atagaaaata gaaaaattaa gccacaaaaa atagaaataa gaaaagacac attaaaaaca 540 

ctaaatgatt ttcaaaaatt actaggagat attaattgga ttcggccaac tctaggcatt 600 

cctacttatg ccatgtcaaa tttgttctct atcttaagag gagactcaga cttaaatagt 660 

aaaagaatgt taaccccaga ggcaacaaaa gaaattaaat tagtggaaga aaaaattcag 720 

tcagcgcaaa taaatagaat agatccctta gccccactcc aacttttgat ttttgccact 780 

gcacattctc caacaggcat cattattcaa aatactgatc ttgtggagtg gtcattcctt 840 

cctcacagta cagttaagac ttttacattg tacttggatc aaatagctac attaatcggt 900 

cagacaagat tacgaataat aaaattatgt gggaatgacc cagacaaaat agttgtccct 960 

ttaaccaagg aacaagttag acaagccttt atcaattctg gtgcatggaa gattggtctt 1020 

gctaattttg tgggaattat tgataatcat tacccaaaaa caaagatctt ccagttctta 1080 

aaattgacta cttggattct acctaaaatt accagacgtg aacctttaga aaatgctcta 1140 

acagtattta ctgatggttc cagcaatgga aaagcagctt acacaggacc gaaagaacga 1200 

gtaatcaaaa ctccatatca atcggctcaa agagcagagt tggttgcagt cattacagtg 1260 

ttacaagatt ttgaccaacc tatcaatatt atatcagatt ctgcatatgt agtacaggct 1320 

acaagggatg ttgagacagc tctaattaaa tatagcatgg atgatcagtt aaaccagcta 1380 

ttcaatttat tacaacaaac tgtaagaaaa agaaatttcc cattttatat tacacatatt 1440 

cgagcacaca ctaatttacc agggcctttg actaaagcaa atgaacaagc tgacttactg 1500 

gtatcatctg cactcataaa agcacaagaa cttcatgctt tgactcatgt aaatgcagca 1560 

ggattaaaaa acaaatttga tgtcacatgg aaacaggcaa aagatattgt acaacattgc 1620 

acccagtgtc aagtcttaca cctgcccact caagaggcag gagttaatcc cagaggtctg 1680 

tgtcctaatg cattatggca aatggatgtc acgcatgtac cttcatttgg aagattatca 1740 

tatgttcacg taacagttga tacttattca catttcatat gggcaacttg ccaaacagga 1800 

gaaagtactt cccatgttaa aaaacattta ttgtcttgtt ttgctgtaat gggagttcca 1860 

gaaaaaatca aaactgacaa tggaccagga tattgtagta aagctttcca aaaattctta 1920 

agtcagtgga aaatttcaca tacaacagga attccttata attcccaagg acaggccata 1980 

gttgaaagaa ctaatagaac actcaaaact caattagtta aacaaaaaga agggggagac 2040 

agtaaggagt gtaccactcc tcagatgcaa cttaatctag cactctatac tttaaatttt 2100 

ttaaacattt atagaaatca gactactact tctgcagaac aacatcttac tggtaaaaag 2160 

aacagcccac atgaaggaaa actaatttgg tggaaagata gtaaaaataa gacatgggaa 2220 

atagggaagg tgataacgtg ggggagaggt tttgcttgtg tttcaccagg agaaaatcag 2280 

cttcctgttt ggatacccac tagacatttg aagttctaca atgaacccat cagagatgca 2340 

aagaaaagca cctccgcgga gacggagaca tcgcaatcga gcaccgttga ctcacaagat 2400 

gaacaaaatg gtgacgtcag aagaacagat gaagttgcca tccaccaaga aggcagagcc 2460 

gccaacttgg gcacaactaa agaagctgac gcagttagct acaaaatatc tagagaacac 2520 

aaaggtgaca caaaccccag agagtatgct gcttgcagcc ttgatgattg tatcaatggt 2580 

ggtaagtctc cctatgcctg caggagcagc tgcagctaa 2619 

<210> 17 
<211> 2671 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 17 

atggcattaa agtcccaact gaggctgaaa aaaatcaaaa aaagaaaagg aatagggcat 60 

cctttttaga agcggtcact gtagagcctc caaaacccat tccattaatt tggggggaaa 120 

aaaaaaactg tatggtaaat cagtagccgc ttccaaaaca aaaactggag gctttacact 180 

tattagcaaa gaaacagtta gaaaaaggac atattgagcc ttcattttcg ccttggaatt 240 

ctcctgtttg taattcagaa aaaatccggc agatggcgta tgctaactga cttaagagcc 300 

attaatgcca taattcaacc catgggggct ctcccatccc ggttgccctc tccagccatg 360 

gtccccttta attataattg atctgaagga ttgctttttt accattcctc tggcaaaaga 420 

ggattttgaa aaatttgctt ttactatacc agcctaaata ataaagaacc agccaccagg 480 

tttcagtgga aagtattgcc tcagggaatg cttaataatt caactatttg tcagactttc 540 

atagctcaag ctctgcaacc agttagagac aagttttcag actgttatat cgttcattat 600 

gttgatattt tgtgtgctgc agaaacgaga gacaaattaa ttgaccgtta cacatttctc 660 

agacagaggt tgccaacgcg ggactgacaa tagcatctga taagattcaa acctctcctc 720 

ctttccatta cttgggaatg caggtagagg aaaggaaaat taaaccacaa aaaatagaaa 780 

taagaaaaga cacattaaaa acattaaatg agtttcaaaa gttggtagga gatactaatt 840 

ggattcggag atattaattg gatttggcca actctaggca ttcctactta tgccatgtca 900 

attttgttct ctttcttaag aggggacttg gaattaaata gtgaaagaat gttacctcca 960 

gaggcaacta aagaaattaa attaattgaa gaaaaaaatt cggtcagcac aagtaaatag 1020 

gatcacttgg ccccactcca aattttgatt tttggtactg cacattctct aacagccatc 1080 

attgttcaaa acacagatct tgtggattgg tccttccttc ctcatagtac aattaagact 1140 

tttacattgt acttggatca aatggctaca ttaattggtc agggaagatt acgaataata 1200 

acattgtgtg gaaatgaccc agataaaatc actgttcctt tcaacaagca acaagttaga 1260 

caagccttta tcagttctgg tgcatggcag attggtcttg ctaattttct gggaattatt 1320 

gataatcatt acccaaaaac aaaaatcttc cagttcttaa aattgactac ttggattcta 1380 

cctaaaatta ccagacgtga acctttagaa aatgctctaa cagtatttac tgatggttcc 1440 

agcaatggaa aagcggctta cacagggccg aaagaacgag taatcaaaac tccgtatcaa 1500 

tcagctcaaa gagcagagtt ggttgcagtc attacagtgt tacaagattt tgaccaacct 1560 

atcaatatta tatcagattc tgcatatgta gtacaggcta caagggatgt tgagacagct 1620 

ctaattaaat atagcacgga cgatcattta aaccagctat tcaatttatt acaacaaact 1680 

gtaagaaaaa gaaatttccc attttatatt actcatattc gagcacacac taatttacca 1740 

gggcctttga ctaaagcaaa tgaacaagct gacttactgg tatcatctgc attcataaaa 1800 

gcacaagaac ttcttgcttt gactcatgta aatgcagcag gattaaaaaa caaatttgat 1860 

gtcacatgga aacaggcaaa agatattgta caacattgca cccagtgtca agtcttacac 1920 

ctgtccactc aagaggcagg agttaatccc agaggtctgt gtcctaatgc gttatggcaa 1980 

atggatggca cgcatgttcc ttcatttgga agattatcat atgttcatgt aacagttgat 2040 

acttattcac atttcatatg ggcaacttgc caaacaggag aaagtacttc ccatgttaaa 2100 

aaacatttat tatcttgttt tgctgtaatg ggagttccag aaaaaatcaa aactgacaat 2160 

ggaccaggat attgtagtaa agctttccaa aaattcttaa gtcagtggaa aatttcacat 2220 

acaacaggaa ttccttataa ttcccaagga caggccatag ttgaaagaac taatagaaca 2280 

ctcaaaactc aattagttaa acaaaaagaa gggggagaca gtaaggagtg taccactcct 2340 

cagatgcaac ttaatctagc actctatact ttaaattttt taaacattta tagaaatcag 2400 

actactactt ctgcaaaaca acatcttact ggtaaaaagc acagcccaca tgaaggaaaa 2460 

ctaatttggt ggaaagataa taaaaataag acatgggaaa tagggaaggt gataacgtgg 2520 

gggagaggtt ttgcttgtgt ttcaccagga gaaaatcagc ttcctgtttg gatacccact 2580 

agacatttga agttctacaa tgaacccatc ggagatgcaa agaaaagggc ctccacagag 2640 

atggtaaccc cagtcacatg gatggataat c 2671 

<210> 18 
<211> 4086 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 18 

atggggcctc tccaacccgg gttgccctct ccggccatga tcccaaaaga ttggccttta 60 
attataattg atctaaagga ttgctttttt accatccctc tggcagagca ggattgtgaa 120 
aaatttgcct ttactatacc agccataaat aataaagaac cagccaccag gtttcagtgg 180 
aaagtgttac ctcagggaat gcttaatagt ccaactattt gtcagacttt tgtaggtcga 240 
gctcttcaac cagtgagaga aaagttttca gactgttata ttattcatta tattgatgat 300 
attttatgtg ctgcagaaac gaaagataaa ttaattgact gttatacatt tctgcaagca 360 
gaggttgcca atgctggact ggcaatagca tccgataaga tccaaacctc tactcctttt 420 
cattatttag ggatgcagat agaaaataga aaaattaagc cacaaaaaat agaaataaga 480 
aaagacacat taaaaacact aaatgatttt caaaaattac taggagatat taattggatt 540 
cggccaactc taggcattcc tacttatgcc atgtcaaatt tgttctctat cttaagagga 600 
gactcagact taaatagtca aagaatatta accccagagg caacaaaaga aattaaatta 660 
gtggaagaaa aaattcagtc agcgcaaata aatagaatag atcccttagc cccactccaa 720 
cttttgattt ttgccactgc acattctcca acaggcatca ttattcaaaa tactgatctt 780 
gtggagtggt cattccttcc tcacagtaca gttaagactt ttacattgta cttggatcaa 840 
atagctacat taatcggtca gacaagatta cgaataacaa aattatgtgg aaatgaccca 900 
gacaaaatag ttgtcccttt aaccaaggaa caagttagac aagcctttat caattctggt 960 
gcatggcaga ttggtcttgc taattttgtg ggacttattg ataatcatta cccaaaaaca 1020 
aagatcttcc agttcttaaa attgactact tggattctac ctaaaattac cagacgtgaa 1080 
cctttagaaa atgctctaac agtatttact gatggttcca gcaatggaaa agcagcttac 1140 
acagggccga aagaacgagt aatcaaaact ccatatcaat cggctcaaag agacgagttg 1200 
gttgcagtca ttacagtgtt acaagatttt gaccaaccta tcaatattat atcagattct 1260 
gcatatgtag tacaggctac aagggatgtt gagacagctc taattaaata tagcatggat 1320 
gatcagttaa accagctatt caatttatta caacaaactg taagaaaaag aaatttccca 1380 
ttttatatta cttatattcg agcacacact aatttaccag ggcctttgac taaagcaaat 1440 
gaacaagctg acttactggt atcatctgca ctcataaaag cacaagaact tcatgctttg 1500 
actcatgtaa atgcagcagg attaaaaaac aaatttgatg tcacatggaa acaggcaaaa 1560 
gatattgtac aacattgcac ccagtgtcaa gtcttacacc tgcccactca agaggcagga 1620 
gttaatccca gaggtctgtg tcctaatgca ttatggcaaa tggatgtcac gcatgtacct 1680 
tcatttggaa gattatcata tgttcatgta acagttgata cttattcaca tttcatatgg 1740 
gcaacttgcc aaacaggaga aagtacttcc catgttaaaa aacatttatt gtcttgtttt 1800 
gctgtaatgg gagttccaga aaaaatcaaa actgacaatg gaccaggata ttgtagtaaa 1860 
gctttccaaa aattcttaag tcagtggaaa atttcacata caacaggaat tccttataat 1920 
tcccaaggac aggccatagt tgaaagaact aatagaacac tcaaaactca attagttaaa 1980 
caaaaagaag ggggagacag taaggagtgt accactcctc agatgcaact taatctagca 2040 
ctctatactt taaatttttt aaacatttat agaaatcaga ctactacttc tgcagaacaa 2100 

catcttactg gtaaaaagaa cagcccacat gaaggaaaac taatttggtg gaaagataat 2160 

aaaaataaga catgggaaat agggaaggtg ataacgtggg ggagaggttt tgcttgtgtt 2220 

tcaccaggag aaaatcagct tcctgtttgg ttacccacta gacatttgaa gttctacaat 2280 

gaacccatcg gagatgcaaa gaaaagggcc tccacggaga tggtaacacc agtcacatgg 2340 

atggataatc ctatagaagt atatgttaat gatagtatat gggtacctgg ccccatagat 2400 

gatcgctgcc ctgccaaacc tgaggaagaa gggatgatga taaatatttc cattgggtat 2460 

cgttatcctc ctatttgcct agggagagca ccaggatgtt taatgcctgc agtccaaaat 2520 

tggttggtag aagtacctac tgtcagtccc atcagtagat tcacttatca catggtaagc 2580 

gggatgtcac tcaggccacg ggtaaattat ttacaagact tttcttatca aagatcatta 2640 

aaatttagac ctaaagggaa accttgcccc aaggaaattc ccaaagaatc aaaaaataca 2700 

gaagttttag tttgggaaga atgtgtggcc aatagtgcgg tgatattata aaacaatgaa 2760 

tttggaacta ttatagattg ggcacctcga ggtcaattct accacaattg ctcaggacaa 2820 

actcagtcgt gtccaagtgc acaagtgagt ccagctgttg atagcgactt aacagaaagt 2880 

ttagacaaac ataagcataa aaaattgcag tctttctacc cttgggaatg gggagaaaaa 2940 

ggaatctcta ccccaagacc aaaaatagta agtcctgttt ctggtcctga acatccagaa 3000 

ttatggaggc ttactgtggc ctcacaccac attagaattt ggtctggaaa tcaaacttta 3060 

gaaacaagag attgtaagcc attttatact gtcgacctaa attccagtct aacagttcct 3120 

ttacaaagtt gcgtaaagcc cccttatatg ctagttgtag gaaatatagt tattaaacca 3180 

gactcccaga ctataacctg tgaaaattgt agattgctta cttgcattga ttcaactttt 3240 

aattggcaac accgtattct gctggtgaga gcaagagagg gcgtgtggat ccctgtgtcc 3300 

atggaccgac cgtgggaggc ctcaccatcc gtccatattt tgactgaagt attaaaaggt 3360 

gttttaaata gatccaaaag attcattttt actttaattg cagtgattat gggattaatt 3420 

gcagtcacag ctacggctgc tgtagcagga gttgcattgc actcttctgt tcagtcagta 3480 

aactttgtta atgattggca aaagaattct acaagattgt ggaattcaca atctagtatt 3540 

gatcaaaaat tggcaaatca aattaatgat cttagacaaa ctgtcatttg gatgggagac 3600 

agactcatga gcttagaaca tcgtttccag ttacaatgtg actggaatac gtcagatttt 3660 

tgtattacac cccaaattta taatgagtct gagcatcact gggacatggt tagacgccat 3720 

ctacagggaa gagaagataa tctcacttta gacatttcca aattaaaaga acaaattttc 3780 

gaagcatcaa aagcccattt aaatttggtg ccaggaactg aggcaattgc aggagttgct 3840 

gatggcctcg caaatcttaa ccctgtcact tgggttaaga ccattggaag tacatcgatt 3900 

ataaatctca tattaatcct tgtgtgcctg ttttgtctgt tgttagtctg caggtgtacc 3960 

caacagctcc gaagagacag cgaccatcga gaacgggcca tgatgacgat ggcggttttg 4020 

tcgaaaagaa aagggggaaa tgtggggaaa agcaagagag atcaaattgt tactgtgtct 4080 

gtgtag 4086 

<210> 19 
<211> 872 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 19 

Met Leu Thr Asp Leu Arg Ala Val Asn Ala Val Ile Gln Pro Met Gly 
1 5 10 15 

Pro Leu Gln Pro Gly Leu Pro Ser Pro Ala Met Ile Pro Lys Asp Trp 
20 25 30 

Pro Leu Ile Ile Ile Asp Leu Lys Asp Cys Phe Phe Thr Ile Pro Leu 
35 40 45 

Ala Glu Gln Asp Cys Glu Lys Phe Ala Phe Thr Ile Pro Ala Ile Asn 
50 55 60 

Asn Lys Glu Pro Ala Thr Arg Phe Gln Trp Lys Val Leu Pro Gln Gly 
65 70 75 80 

Met Leu Asn Ser Pro Thr Ile Cys Gln Thr Phe Val Gly Arg Ala Leu 
85 90 95 

Gln Pro Val Arg Glu Lys Phe Ser Asp Cys Tyr Ile Ile His Cys Ile 
100 105 110 

Asp Asp Ile Leu Cys Ala Ala Glu Thr Lys Asp Lys Leu Ile Asp Cys 
115 120 125 

Tyr Thr Phe Leu Gln Ala Glu Val Ala Asn Ala Gly Leu Ala Ile Ala 
130 135 140 

Ser Asp Lys Ile Gln Thr Ser Thr Pro Phe His Tyr Leu Gly Met Gln 
145 150 155 160 

Ile Glu Asn Arg Lys Ile Lys Pro Gln Lys Ile Glu Ile Arg Lys Asp 
165 170 175 

Thr Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu Leu Gly Asp Ile Asn 
180 185 190 

Trp Ile Arg Pro Thr Leu Gly Ile Pro Thr Tyr Ala Met Ser Asn Leu 
195 200 205 

Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn Ser Lys Arg Met Leu 
210 215 220 

Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val Glu Glu Lys Ile Gln 
225 230 235 240 

Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala Pro Leu Gln Leu Leu 
245 250 255 

Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile Ile Ile Gln Asn Thr 
260 265 270 

Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser Thr Val Lys Thr Phe 
275 280 285 

Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile Gly Gln Thr Arg Leu 
290 295 300 

Arg Ile Ile Lys Leu Cys Gly Asn Asp Pro Asp Lys Ile Val Val Pro 
305 310 315 320 

Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile Asn Ser Gly Ala Trp 
325 330 335 

Lys Ile Gly Leu Ala Asn Phe Val Gly Ile Ile Asp Asn His Tyr Pro 
340 345 350 

Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr Thr Trp Ile Leu Pro 
355 360 365 

Lys Ile Thr Arg Arg Glu Pro Leu Glu Asn Ala Leu Thr Val Phe Thr 
370 375 380 

Asp Gly Ser Ser Asn Gly Lys Ala Ala Tyr Thr Gly Pro Lys Glu Arg 
385 390 395 400 

Val Ile Lys Thr Pro Tyr Gln Ser Ala Gln Arg Ala Glu Leu Val Ala 
405 410 415 

Val Ile Thr Val Leu Gln Asp Phe Asp Gln Pro Ile Asn Ile Ile Ser 
420 425 430 

Asp Ser Ala Tyr Val Val Gln Ala Thr Arg Asp Val Glu Thr Ala Leu 
435 440 445 

Ile Lys Tyr Ser Met Asp Asp Gln Leu Asn Gln Leu Phe Asn Leu Leu 
450 455 460 

Gln Gln Thr Val Arg Lys Arg Asn Phe Pro Phe Tyr Ile Thr His Ile 
465 470 475 480 

Arg Ala His Thr Asn Leu Pro Gly Pro Leu Thr Lys Ala Asn Glu Gln 
485 490 495 

Ala Asp Leu Leu Val Ser Ser Ala Leu Ile Lys Ala Gln Glu Leu His 
500 505 510 

Ala Leu Thr His Val Asn Ala Ala Gly Leu Lys Asn Lys Phe Asp Val 
515 520 525 

Thr Trp Lys Gln Ala Lys Asp Ile Val Gln His Cys Thr Gln Cys Gln 
530 535 540 

Val Leu His Leu Pro Thr Gln Glu Ala Gly Val Asn Pro Arg Gly Leu 
545 550 555 560 

Cys Pro Asn Ala Leu Trp Gln Met Asp Val Thr His Val Pro Ser Phe 
565 570 575 

Gly Arg Leu Ser Tyr Val His Val Thr Val Asp Thr Tyr Ser His Phe 
580 585 590 

Ile Trp Ala Thr Cys Gln Thr Gly Glu Ser Thr Ser His Val Lys Lys 
595 600 605 

His Leu Leu Ser Cys Phe Ala Val Met Gly Val Pro Glu Lys Ile Lys 
610 615 620 

Thr Asp Asn Gly Pro Gly Tyr Cys Ser Lys Ala Phe Gln Lys Phe Leu 
625 630 635 640 

Ser Gln Trp Lys Ile Ser His Thr Thr Gly Ile Pro Tyr Asn Ser Gln 
645 650 655 

Gly Gln Ala Ile Val Glu Arg Thr Asn Arg Thr Leu Lys Thr Gln Leu 
660 665 670 

Val Lys Gln Lys Glu Gly Gly Asp Ser Lys Glu Cys Thr Thr Pro Gln 
675 680 685 

Met Gln Leu Asn Leu Ala Leu Tyr Thr Leu Asn Phe Leu Asn Ile Tyr 
690 695 700 

Arg Asn Gln Thr Thr Thr Ser Ala Glu Gln His Leu Thr Gly Lys Lys 
705 710 715 720 

Asn Ser Pro His Glu Gly Lys Leu Ile Trp Trp Lys Asp Ser Lys Asn 
725 730 735 

Lys Thr Trp Glu Ile Gly Lys Val Ile Thr Trp Gly Arg Gly Phe Ala 
740 745 750 

Cys Val Ser Pro Gly Glu Asn Gln Leu Pro Val Trp Ile Pro Thr Arg 
755 760 765 

His Leu Lys Phe Tyr Asn Glu Pro Ile Arg Asp Ala Lys Lys Ser Thr 
770 775 780 

Ser Ala Glu Thr Glu Thr Ser Gln Ser Ser Thr Val Asp Ser Gln Asp 
785 790 795 800 

Glu Gln Asn Gly Asp Val Arg Arg Thr Asp Glu Val Ala Ile His Gln 
805 810 815 

Glu Gly Arg Ala Ala Asn Leu Gly Thr Thr Lys Glu Ala Asp Ala Val 
820 825 830 

Ser Tyr Lys Ile Ser Arg Glu His Lys Gly Asp Thr Asn Pro Arg Glu 
835 840 845 

Tyr Ala Ala Cys Ser Leu Asp Asp Cys Ile Asn Gly Gly Lys Ser Pro 
850 855 860 

180 185 190 

Asn Leu Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn Ser Gln Arg 
195 200 205 

Ile Leu Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val Glu Glu Lys 
210 215 220 

Ile Gln Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala Pro Leu Gln 
225 230 235 240 

Leu Leu Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile Ile Ile Gln 
245 250 255 

Asn Thr Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser Thr Val Lys 
260 265 270 

Thr Phe Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile Gly Gln Thr 
275 280 285 

Arg Leu Arg Ile Thr Lys Leu Cys Gly Asn Asp Pro Asp Lys Ile Val 
290 295 300 

Val Pro Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile Asn Ser Gly 
305 310 315 320 

Ala Trp Gln Ile Gly Leu Ala Asn Phe Val Gly Leu Ile Asp Asn His 
325 330 335 

Tyr Pro Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr Thr Trp Ile 
340 345 350 

Leu Pro Lys Ile Thr Arg Arg Glu Pro Leu Glu Asn Ala Leu Thr Val 
355 360 365 

Phe Thr Asp Gly Ser Ser Asn Gly Lys Ala Ala Tyr Thr Gly Pro Lys 
370 375 380 

Glu Arg Val Ile Lys Thr Pro Tyr Gln Ser Ala Gln Arg Asp Glu Leu 
385 390 395 400 

Val Ala Val Ile Thr Val Leu Gln Asp Phe Asp Gln Pro Ile Asn Ile 
405 410 415 

Ile Ser Asp Ser Ala Tyr Val Val Gln Ala Thr Arg Asp Val Glu Thr 
420 425 430 

Ala Leu Ile Lys Tyr Ser Met Asp Asp Gln Leu Asn Gln Leu Phe Asn 
435 440 445 

Leu Leu Gln Gln Thr Val Arg Lys Arg Asn Phe Pro Phe Tyr Ile Thr 
450 455 460 

Tyr Ile Arg Ala His Thr Asn Leu Pro Gly Pro Leu Thr Lys Ala Asn 
465 470 475 480 

Glu Gln Ala Asp Leu Leu Val Ser Ser Ala Leu Ile Lys Ala Gln Glu 
485 490 495 

Leu His Ala Leu Thr His Val Asn Ala Ala Gly Leu Lys Asn Lys Phe 
500 505 510 

Asp Val Thr Trp Lys Gln Ala Lys Asp Ile Val Gln His Cys Thr Gln 
515 520 525 

Cys Gln Val Leu His Leu Pro Thr Gln Glu Ala Gly Val Asn Pro Arg 
530 535 540 

Gly Leu Cys Pro Asn Ala Leu Trp Gln Met Asp Val Thr His Val Pro 
545 550 555 560 

Ser Phe Gly Arg Leu Ser Tyr Val His Val Thr Val Asp Thr Tyr Ser 
565 570 575 

His Phe Ile Trp Ala Thr Cys Gln Thr Gly Glu Ser Thr Ser His Val 
580 585 590 

Lys Lys His Leu Leu Ser Cys Phe Ala Val Met Gly Val Pro Glu Lys 
595 600 605 

Ile Lys Thr Asp Asn Gly Pro Gly Tyr Cys Ser Lys Ala Phe Gln Lys 
610 615 620 

Phe Leu Ser Gln Trp Lys Ile Ser His Thr Thr Gly Ile Pro Tyr Asn 
625 630 635 640 

Ser Gln Gly Gln Ala Ile Val Glu Arg Thr Asn Arg Thr Leu Lys Thr 
645 650 655 

Gln Leu Val Lys Gln Lys Glu Gly Gly Asp Ser Lys Glu Cys Thr Thr 
660 665 670 

Pro Gln Met Gln Leu Asn Leu Ala Leu Tyr Thr Leu Asn Phe Leu Asn 
675 680 685 

Ile Tyr Arg Asn Gln Thr Thr Thr Ser Ala Glu Gln His Leu Thr Gly 
690 695 700 

Lys Lys Asn Ser Pro His Glu Gly Lys Leu Ile Trp Trp Lys Asp Asn 
705 710 715 720 

Lys Asn Lys Thr Trp Glu Ile Gly Lys Val Ile Thr Trp Gly Arg Gly 
725 730 735 

Phe Ala Cys Val Ser Pro Gly Glu Asn Gln Leu Pro Val Trp Leu Pro 
740 745 750 

Thr Arg His Leu Lys Phe Tyr Asn Glu Pro Ile Gly Asp Ala Lys Lys 
755 760 765 

Arg Ala Ser Thr Glu Met Val Thr Pro Val Thr Trp Met Asp Asn Pro 
770 775 780 

Ile Glu Val Tyr Val Asn Asp Ser Ile Trp Val Pro Gly Pro Ile Asp 
785 790 795 800 

Asp Arg Cys Pro Ala Lys Pro Glu Glu Glu Gly Met Met Ile Asn Ile 
805 810 815 

Ser Ile Gly Tyr Arg Tyr Pro Pro Ile Cys Leu Gly Arg Ala Pro Gly 
820 825 830 

Cys Leu Met Pro Ala Val Gln Asn Trp Leu Val Glu Val Pro Thr Val 
835 840 845 

Ser Pro Ile Ser Arg Phe Thr Tyr His Met Val Ser Gly Met Ser Leu 
850 855 860 

Arg Pro Arg Val Asn Tyr Leu Gln Asp Phe Ser Tyr Gln Arg Ser Leu 
865 870 875 880 

Lys Phe Arg Pro Lys Gly Lys Pro Cys Pro Lys Glu Ile Pro Lys Glu 
885 890 895 

Ser Lys Asn Thr Glu Val Leu Val Trp Glu Glu Cys Val Ala Asn Ser 
900 905 910 

Ala Val Ile Leu Xaa Asn Asn Glu Phe Gly Thr Ile Ile Asp Trp Ala 
915 920 925 

Pro Arg Gly Gln Phe Tyr His Asn Cys Ser Gly Gln Thr Gln Ser Cys 
930 935 940 

Pro Ser Ala Gln Val Ser Pro Ala Val Asp Ser Asp Leu Thr Glu Ser 
945 950 955 960 

Leu Asp Lys His Lys His Lys Lys Leu Gln Ser Phe Tyr Pro Trp Glu 
965 970 975 

Trp Gly Glu Lys Gly Ile Ser Thr Pro Arg Pro Lys Ile Val Ser Pro 
980 985 990 

Val Ser Gly Pro Glu His Pro Glu Leu Trp Arg Leu Thr Val Ala Ser 
995 1000 1005 

His His Ile Arg Ile Trp Ser Gly Asn Gln Thr Leu Glu Thr Arg Asp 
1010 1015 1020 

Cys Lys Pro Phe Tyr Thr Val Asp Leu Asn Ser Ser Leu Thr Val Pro 
1025 1030 1035 1040 

Leu Gln Ser Cys Val Lys Pro Pro Tyr Met Leu Val Val Gly Asn Ile 
1045 1050 1055 

Val Ile Lys Pro Asp Ser Gln Thr Ile Thr Cys Glu Asn Cys Arg Leu 
1060 1065 1070 

Leu Thr Cys Ile Asp Ser Thr Phe Asn Trp Gln His Arg Ile Leu Leu 
1075 1080 1085 

Val Arg Ala Arg Glu Gly Val Trp Ile Pro Val Ser Met Asp Arg Pro 
1090 1095 1100 

Trp Glu Ala Ser Pro Ser Val His Ile Leu Thr Glu Val Leu Lys Gly 
1105 1110 1115 1120 

Val Leu Asn Arg Ser Lys Arg Phe Ile Phe Thr Leu Ile Ala Val Ile 
1125 1130 1135 

Met Gly Leu Ile Ala Val Thr Ala Thr Ala Ala Val Ala Gly Val Ala 
1140 1145 1150 

Leu His Ser Ser Val Gln Ser Val Asn Phe Val Asn Asp Trp Gln Lys 
1155 1160 1165 

Asn Ser Thr Arg Leu Trp Asn Ser Gln Ser Ser Ile Asp Gln Lys Leu 
1170 1175 1180 

Ala Asn Gln Ile Asn Asp Leu Arg Gln Thr Val Ile Trp Met Gly Asp 
1185 1190 1195 1200 

Arg Leu Met Ser Leu Glu His Arg Phe Gln Leu Gln Cys Asp Trp Asn 
1205 1210 1215 

Thr Ser Asp Phe Cys Ile Thr Pro Gln Ile Tyr Asn Glu Ser Glu His 
1220 1225 1230 

His Trp Asp Met Val Arg Arg His Leu Gln Gly Arg Glu Asp Asn Leu 
1235 1240 1245 

Thr Leu Asp Ile Ser Lys Leu Lys Glu Gln Ile Phe Glu Ala Ser Lys 
1250 1255 1260 

Ala His Leu Asn Leu Val Pro Gly Thr Glu Ala Ile Ala Gly Val Ala 
1265 1270 1275 1280 

Asp Gly Leu Ala Asn Leu Asn Pro Val Thr Trp Val Lys Thr Ile Gly 
1285 1290 1295 

Ser Thr Ser Ile Ile Asn Leu Ile Leu Ile Leu Val Cys Leu Phe Cys 
1300 1305 1310 

Leu Leu Leu Val Cys Arg Cys Thr Gln Gln Leu Arg Arg Asp Ser Asp 
1315 1320 1325 

His Arg Glu Arg Ala Met Met Thr Met Ala Val Leu Ser Lys Arg Lys 
1330 1335 1340 

Gly Gly Asn Val Gly Lys Ser Lys Arg Asp Gln Ile Val Thr Val Ser 
1345 1350 1355 1360 

Val 

<210> 21 
<211> 956 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 21 

Asn Lys Ser Arg Lys Arg Arg Asn Arg Glu Ser Leu Leu Gly Ala Ala 
1 5 10 15 

Thr Val Glu Pro Pro Lys Pro Ile Pro Leu Thr Trp Lys Thr Glu Lys 
20 25 30 

Pro Val Trp Val Asn Gln Trp Pro Leu Pro Lys Gln Lys Leu Glu Ala 
35 40 45 

Leu His Leu Leu Ala Asn Glu Gln Leu Glu Lys Gly His Ile Glu Pro 
50 55 60 

Ser Phe Ser Pro Trp Asn Ser Pro Val Phe Val Ile Gln Lys Lys Ser 
65 70 75 80 

Gly Lys Trp Arg Met Leu Thr Asp Leu Arg Ala Val Asn Ala Val Ile 
85 90 95 

Gln Pro Met Gly Pro Leu Gln Pro Gly Leu Pro Ser Pro Ala Met Ile 
100 105 110 

Pro Lys Asp Trp Pro Leu Ile Ile Ile Asp Leu Lys Asp Cys Phe Phe 
115 120 125 



Thr Ile Pro Leu Ala Glu Gln Asp Cys Glu Lys Phe Ala Phe Thr Ile 
130 135 140 

Pro Ala Ile Asn Asn Lys Glu Pro Ala Thr Arg Phe Gln Trp Lys Val 
145 150 155 160 

Leu Pro Gln Gly Met Leu Asn Ser Pro Thr Ile Cys Gln Thr Phe Val 
165 170 175 

Gly Arg Ala Leu Gln Pro Val Arg Glu Lys Phe Ser Asp Cys Tyr Ile 
180 185 190 

Ile His Cys Ile Asp Asp Ile Leu Cys Ala Ala Glu Thr Lys Asp Lys 
195 200 205 

Leu Ile Asp Cys Tyr Thr Phe Leu Gln Ala Glu Val Ala Asn Ala Gly 
210 215 220 

Leu Ala Ile Ala Ser Asp Lys Ile Gln Thr Ser Thr Pro Phe His Tyr 
225 230 235 240 

Leu Gly Met Gln Ile Glu Asn Arg Lys Ile Lys Pro Gln Lys Ile Glu 
245 250 255 

Ile Arg Lys Asp Thr Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu Leu 
260 265 270 

Gly Asp Ile Asn Trp Ile Arg Pro Thr Leu Gly Ile Pro Thr Tyr Ala 
275 280 285 

Met Ser Asn Leu Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn Ser 
290 295 300 

Lys Arg Met Leu Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val Glu 
305 310 315 320 

Glu Lys Ile Gln Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala Pro 
325 330 335 

Leu Gln Leu Leu Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile Ile 
340 345 350 

Ile Gln Asn Thr Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser Thr 
355 360 365 

Val Lys Thr Phe Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile Gly 
370 375 380 

Gln Thr Arg Leu Arg Ile Ile Lys Leu Cys Gly Asn Asp Pro Asp Lys 
385 390 395 400 

Ile Val Val Pro Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile Asn 
405 410 415 

Ser Gly Ala Trp Lys Ile Gly Leu Ala Asn Phe Val Gly Ile Ile Asp 
420 425 430 

Asn His Tyr Pro Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr Thr 
435 440 445 

Trp Ile Leu Pro Lys Ile Thr Arg Arg Glu Pro Leu Glu Asn Ala Leu 
450 455 460 

Thr Val Phe Thr Asp Gly Ser Ser Asn Gly Lys Ala Ala Tyr Thr Gly 
465 470 475 480 

Pro Lys Glu Arg Val Ile Lys Thr Pro Tyr Gln Ser Ala Gln Arg Ala 
485 490 495 

Glu Leu Val Ala Val Ile Thr Val Leu Gln Asp Phe Asp Gln Pro Ile 
500 505 510 

Asn Ile Ile Ser Asp Ser Ala Tyr Val Val Gln Ala Thr Arg Asp Val 
515 520 525 

Glu Thr Ala Leu Ile Lys Tyr Ser Met Asp Asp Gln Leu Asn Gln Leu 
530 535 540 

Phe Asn Leu Leu Gln Gln Thr Val Arg Lys Arg Asn Phe Pro Phe Tyr 
545 550 555 560 

Ile Thr His Ile Arg Ala His Thr Asn Leu Pro Gly Pro Leu Thr Lys 
565 570 575 

Ala Asn Glu Gln Ala Asp Leu Leu Val Ser Ser Ala Leu Ile Lys Ala 
580 585 590 

Gln Glu Leu His Ala Leu Thr His Val Asn Ala Ala Gly Leu Lys Asn 
595 600 605 

Lys Phe Asp Val Thr Trp Lys Gln Ala Lys Asp Ile Val Gln His Cys 
610 615 620 

Thr Gln Cys Gln Val Leu His Leu Pro Thr Gln Glu Ala Gly Val Asn 
625 630 635 640 

Pro Arg Gly Leu Cys Pro Asn Ala Leu Trp Gln Met Asp Val Thr His 
645 650 655 

Val Pro Ser Phe Gly Arg Leu Ser Tyr Val His Val Thr Val Asp Thr 
660 665 670 

Tyr Ser His Phe Ile Trp Ala Thr Cys Gln Thr Gly Glu Ser Thr Ser 
675 680 685 

His Val Lys Lys His Leu Leu Ser Cys Phe Ala Val Met Gly Val Pro 
690 695 700 

Glu Lys Ile Lys Thr Asp Asn Gly Pro Gly Tyr Cys Ser Lys Ala Phe 
705 710 715 720 

Gln Lys Phe Leu Ser Gln Trp Lys Ile Ser His Thr Thr Gly Ile Pro 
725 730 735 

Tyr Asn Ser Gln Gly Gln Ala Ile Val Glu Arg Thr Asn Arg Thr Leu 
740 745 750 

Lys Thr Gln Leu Val Lys Gln Lys Glu Gly Gly Asp Ser Lys Glu Cys 
755 760 765 

Thr Thr Pro Gln Met Gln Leu Asn Leu Ala Leu Tyr Thr Leu Asn Phe 
770 775 780 

Leu Asn Ile Tyr Arg Asn Gln Thr Thr Thr Ser Ala Glu Gln His Leu 
785 790 795 800 

Thr Gly Lys Lys Asn Ser Pro His Glu Gly Lys Leu Ile Trp Trp Lys 
805 810 815 

Asp Asn Lys Asn Lys Thr Trp Glu Ile Gly Lys Val Ile Thr Trp Gly 
820 825 830 

Arg Gly Phe Ala Cys Val Ser Pro Gly Glu Asn Gln Leu Pro Val Trp 
835 840 845 

Ile Pro Thr Arg His Leu Lys Phe Tyr Asn Glu Pro Ile Arg Asp Ala 
850 855 860 

Lys Lys Ser Thr Ser Ala Glu Thr Glu Thr Ser Gln Ser Ser Thr Val 
865 870 875 880 

Asp Ser Gln Asp Glu Gln Asn Gly Asp Val Arg Arg Thr Asp Glu Val 
885 890 895 

Ala Ile His Gln Glu Gly Arg Ala Ala Asn Leu Gly Thr Thr Lys Glu 
900 905 910 

Ala Asp Ala Val Ser Tyr Lys Ile Ser Arg Glu His Lys Gly Asp Thr 
915 920 925 

Asn Pro Arg Glu Tyr Ala Ala Cys Ser Leu Asp Asp Cys Ile Asn Gly 
930 935 940 

Gly Lys Ser Pro Tyr Ala Cys Arg Ser Ser Cys Ser 
945 950 955 



<210> 22 

<211> 2000 

<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 



<400> 22 
atgaacccat cagagatgca aagaaaagca cctccgcgga gacggagaca tcgcaatcga 60 
gcaccgttga ctcacaagat gaacaaaatg gtgacgtcag aagaacagat gaagttgcca 120 
tccaccaaga aggcagagcc gccaacttgg gcacaactaa agaagctgac gcagttagct 180 
acaaaatatc tagagaacac aaaggtgaca caaaccccag agagtatgct gcttgcagcc 240 
ttgatgattg tatcaatggt ggtaagtctc cctatgcctg caggagcagc tgcagctaac 300 
tatacctact gggcctatgt gcctttcccg cccttaattc gggcagtcac atggatggat 360 
aatcctacag aagtatatgt taatgatagt gtatgggtac ctggccccat agatgatcgc 420 
tgccctgcca aacctgagga agaagggatg atgataaata tttccattgg gtatcattat 480 
cctcctattt gcctagggag agcaccagga tgtttaatgc ctgcagtcca aaattggttg 540 
gtagaagtac ctactgtcag tcccatctgt agattcactt atcacatggt aagcgggatg 600 
tcactcaggc cacgggtaaa ttatttacaa gacttttctt atcaaagatc attaaaattt 660 
agacctaaag ggaaaccttg ccccaaggaa attcccaaag aatcaaaaaa tacagaagtt 720 
ttagtttggg aagaatgtgt ggccaatagt gcggtgatat tacaaaacaa tgaattcgga 780 
actattatag attgggcacc tcgaggtcaa ttctaccaca attgctcagg acaaactcag 840 
tcgtgtccaa gtgcacaagt gagtccagct gttgatagcg acttaacaga aagtttagac 900 
aaacataagc ataaaaaatt gcagtctttc tacccttggg aatggggaga aaaaggaatc 960 
tctaccccaa gaccaaaaat agtaagtcct gtttctggtc ctgaacatcc agaattatgg 1020 
aggcttactg tggcctcaca ccacattaga atttggtctg gaaatcaaac tttagaaaca 1080 
agagatcgta agccatttta tactattgac ctgaattcca gtctaacagt tcctttacaa 1140 
agttgcgtaa agccccctta tatgctagtt gtaggaaata tagttattaa accagactcc 1200 
cagactataa cctgtgaaaa ttgtagattg cttacttgca ttgattcaac ttttaattgg 1260 
caacaccgta ttctgctggt gagagcaaga gagggcgtgt ggatccctgt gtccatggac 1320 
cgaccgtggg aggcctcgcc atccgtccat attttgactg aagtattaaa aggtgtttta 1380 
aatagatcca aaagattcat ttttacttta attgcagtga ttatgggatt aattgcagtc 1440 
acagctacgg ctgctgtagc aggagttgca ttgcactctt ctgttcagtc agtaaacttt 1500 
gttaatgatt ggcaaaaaaa ttctacaaga ttgtggaatt cacaatctag tattgatcaa 1560 
aaattggcaa atcaaattaa tgatcttaga caaactgtca tttggatggg agacagactc 1620 
atgagcttag aacatcgttt ccagttacaa tgtgactgga atacgtcaga tttttgtatt 1680 
acaccccaaa tttataatga gtctgagcat cactgggaca tggttagacg ccatctacag 1740 
ggaagagaag ataatctcac tttagacatt tccaaattaa aagaacaaat tttcgaagca 1800 
tcaaaagccc atttaaattt ggtgccagga actgaggcaa ttgcaggagt tgctgatggc 1860 
ctcgcaaatc ttaaccctgt cacttgggtt aagaccattg gaagtactac gattataaat 1920 
ctcatattaa tccttgtgtg cctgttttgt ctgttgttag tctgcaggtg tacccaacag 1980 
ctccgaagag acagcgacca 2000 



<210> 23 

<211> 2085 

<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 



<400> 23 
atgcaaagaa aagcacctcc gcggagacgg agacatcgca atcgagcacc gttgactcac 60 
aagatgaaca aaatggtgac gtcagaagaa cagatgaagt tgccatccac caagaaggca 120 
gagccgccaa cttgggcaca actaaagaag ctgacgcagt tagctacaaa atatctagag 180 
aacacaaagg tgacacaaac cccagagagt atgctgcttg cagccttgat gattgtatca 240 
atggtggtaa gtctccctat gcctgcagga gcagctgcag ctaactatac ctactgggcc 300 
tatgtgcctt tcccgccctt aattcgggca gtcacatgga tggataatcc tacagaagta 360 
tatgttaatg atagtgtatg ggtacctggc cccatagatg atcgctgccc tgccaaacct 420 
gaggaagaag ggatgatgat aaatatttcc attgggtatc attatcctcc tatttgccta 480 
gggagagcac caggatgttt aatgcctgca gtccaaaatt ggttggtaga agtacctact 540 
gtcagtccca tctgtagatt cacttatcac atggtaagcg ggatgtcact caggccacgg 600 
gtaaattatt tacaagactt ttcttatcaa agatcattaa aatttagacc taaagggaaa 660 
ccttgcccca aggaaattcc caaagaatca aaaaatacag aagttttagt ttgggaagaa 720 
tgtgtggcca atagtgcggt gatattacaa aacaatgaat tcggaactat tatagattgg 780 
gcacctcgag gtcaattcta ccacaattgc tcaggacaaa ctcagtcgtg tcaaagtgca 840 
caagtgagtc cagctgttga tagcgactta acagaaagtt tagacaaaca taagcataaa 900 
aaattgcagt ctttctaccc ttgggaatgg ggagaaaaag gaatctctac cccaagacca 960 
aaaatagtaa gtcctgtttc tggtcctgaa catccagaat tatggaggct tactgtggcc 1020 
tcacaccaca ttagaatttg gtctggaaat caaactttag aaacaagaga tcgtaagcca 1080 
ttttatacta ttgacctgaa ttccagtcta acagttcctt tacaaagttg cgtaaagccc 1140 
ccttatatgc tagttgtagg aaatatagtt attaaaccag actcccagac tataacctgt 1200 
gaaaattgta gattgcttac ttgcattgat tcaactttta attggcaaca ccgtattctg 1260 
ctggtgagag caagagaggg cgtgtggatc cctgtgtcca tggaccgacc gtgggaggcc 1320 
tcgccatccg tccatatttt gactgaagta ttaaaaggtg ttttaaatag atccaaaaga 1380 
ttcattttta ctttaattgc agtgattatg ggattaattg cagtcacagc tacggctgct 1440 
gtagcaggag ttgcattgca ctcttctgtt cagtcagtaa actttgttaa tgattggcaa 1500 
aaaaattcta caagattgtg gaattcacaa tctagtattg atcaaaaatt ggcaaatcaa 1560 
attaatgatc ttagacaaac tgtcatttgg atgggagaca gactcatgag cttagaacat 1620 
cgtttccagt tacaatgtga ctggaatacg tcagattttt gtattacacc ccaaatttat 1680 
aatgagtctg agcatcactg ggacatggtt agacgccatc tacagggaag agaagataat 1740 
ctcactttag acatttccaa attaaaagaa caaattttcg aagcatcaaa agcccattta 1800 
aatttggtgc caggaactga ggcaattgca ggagttgctg atggcctcgc aaatcttaac 1860 
cctgtcactt gggttaagac cattggaagt actacgatta taaatctcat attaatcctt 1920 
gtgtgcctgt tttgtctgtt gttagtctgc aggtgtaccc aacagctccg aagagacagc 1980 
gaccatcgag aacgggccat gatgacgatg gcggttttgt cgaaaagaaa agggggaaat 2040 
gtggggaaaa gcaagagaga tcagattgtt actgtgtctg tgtag 2085 



<210> 24 

<211> 1665 

<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 



<400> 24 
gtcacatgga tggataatcc tatagaagta tatgttaatg atagtgtatg ggtacctggc 60

cccacagatg atcgctgccc tgccaaacct gaggaagaag ggatgatgat aaatatttcc 120

attgtgtatc gttatcctcc tatttgccta gggagagcac caggatgttt aatgcctgca 180

gtccaaaatt ggttggtaga agtacctact gtcagtccta acagtagatt cacttatcac 240

atggtaagcg ggatgtcact caggccacgg gtaaattatt tacaagactt ttcttatcaa 300

agatcattaa aatttagacc taaagggaaa ccttgcccca aggaaattcc caaagaatca 360

aaaaatacag aagttttagt ttgggaagaa tgtgtggcca atagtgcggt gatattacaa 420 

aacaatgaat tcggaactat tatagattgg gcacctcgag gtcaattcta ccacaattgc 480 

tcaggacaaa ctcagtcgtg tccaagtgca caagtgagtc cagctgttga tagcgactta 540 

acagaaagtc tagacaaaca taagcataaa aaattacagt ctttctaccc ttgggaatgg 600 

ggagaaaaag gaatctctac cccaagacca gaaataataa gtcctgtttc tggtcctgaa 660 

catccagaat tatggaggct ttggcctgac accacattag aatttggtct ggaaatcaaa 720 

ctttagaaac aagagatcgt aagccatttt atactatcga cctaaattcc agtctaacgg 780 

ttcctttaca aagttgcgta aagccctctt atatgctagt tgtaggaaat atagttatta 840 

aaccagactc ccaaactata acctgtgaaa attgtagatt gtttacttgc attgattcaa 900 

cttttaattg gcggcaccgt attctgctgg tgagagcaag agagggcgtg tggatctctg 960 

tgtccgtgga ctgaccgtgg gaggcctcgc catccatcca tattttgact gaagtattaa 1020 

aagacatttt aaatagatcc aaaagattca tttttacctt aattgcagtg attatgggat 1080 

taattgcagt cacagctacg gctgctgtgg caggagttgc attgcactct tctgttcagt 1140 

cggtaaactt tgttaatgat tggcaaaaga attctacaag attgtggaat tcacaatcta 1200 

gtattgatca aaaattggca aatcaaatta atgatcttag acaaactgtc atttggatgg 1260 

gagacagact catgagctta gaacattgtt tccagttaca gtgtgactgg aatacgtcag 1320 

atttttgtat tacaccccaa atttataatg agtctgagca tcactgggac atggttagac 1380 

gccatctaca gggaagagaa gataatctca ctttagacat ttccaaatta aaataacaaa 1440 

ttttcgaagc atcaaaagcc catttaaatt tgatgccagg aactgaggca attgcaggag 1500 

ttgctgatgg cctcgcaaat cttaaccctg tcacttgggt taagaccatc ggaagtacta 1560 

tgattataaa tctcatatta atccttgtgt gcctgttttg tctgttgtta gtctgcaggt 1620 

gtacccaaca gctccgaaga gacagcgacc atcgagaacg ggcca 1665 

<210> 25 
<211> 4086 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 25 

atggggcctc tccaacccgg gttgccctct ccggccatga tcccaaaaga ttggccttta 60 

attataattg atctaaagga ttgctttttt accatccctc tggcagagca ggattgtgaa 120 

aaatttgcct ttactatacc agccataaat aataaagaac cagccaccag gtttcagtgg 180 

aaagtgttac ctcagggaat gcttaatagt ccaactattt gtcagacttt tgtaggtcga 240 

gctcttcaac cagtgagaga aaagttttca gactgttata ttattcatta tattgatgat 300 

attttatgtg ctgcagaaac gaaagataaa ttaattgact gttatacatt tctgcaagca 360 

gaggttgcca atgctggact ggcaatagca tccgataaga tccaaacctc tactcctttt 420 

cattatttag ggatgcagat agaaaataga aaaattaagc cacaaaaaat agaaataaga 480 

aaagacacat taaaaacact aaatgatttt caaaaattac taggagatat taattggatt 540 

cggccaactc taggcattcc tacttatgcc atgtcaaatt tgttctctat cttaagagga 600 

gactcagact taaatagtca aagaatatta accccagagg caacaaaaga aattaaatta 660 

gtggaagaaa aaattcagtc agcgcaaata aatagaatag atcccttagc cccactccaa 720 

cttttgattt ttgccactgc acattctcca acaggcatca ttattcaaaa tactgatctt 780 

gtggagtggt cattccttcc tcacagtaca gttaagactt ttacattgta cttggatcaa 840 

atagctacat taatcggtca gacaagatta cgaataacaa aattatgtgg aaatgaccca 900 

gacaaaatag ttgtcccttt aaccaaggaa caagttagac aagcctttat caattctggt 960 

gcatggcaga ttggtcttgc taattttgtg ggacttattg ataatcatta cccaaaaaca 1020 

aagatcttcc agttcttaaa attgactact tggattctac ctaaaattac cagacgtgaa 1080 

cctttagaaa atgctctaac agtatttact gatggttcca gcaatggaaa agcagcttac 1140 

acagggccga aagaacgagt aatcaaaact ccatatcaat cggctcaaag agacgagttg 1200 

gttgcagtca ttacagtgtt acaagatttt gaccaaccta tcaatattat atcagattct 1260 

gcatatgtag tacaggctac aagggatgtt gagacagctc taattaaata tagcatggat 1320 

gatcagttaa accagctatt caatttatta caacaaactg taagaaaaag aaatttccca 1380 

ttttatatta cttatattcg agcacacact aatttaccag ggcctttgac taaagcaaat 1440 

gaacaagctg acttactggt atcatctgca ctcataaaag cacaagaact tcatgctttg 1500 

actcatgtaa atgcagcagg attaaaaaac aaatttgatg tcacatggaa acaggcaaaa 1560 

gatattgtac aacattgcac ccagtgtcaa gtcttacacc tgcccactca agaggcagga 1620 

gttaatccca gaggtctgtg tcctaatgca ttatggcaaa tggatgtcac gcatgtacct 1680 

tcatttggaa gattatcata tgttcatgta acagttgata cttattcaca tttcatatgg 1740 

gcaacttgcc aaacaggaga aagtacttcc catgttaaaa aacatttatt gtcttgtttt 1800 

gctgtaatgg gagttccaga aaaaatcaaa actgacaatg gaccaggata ttgtagtaaa 1860 

gctttccaaa aattcttaag tcagtggaaa atttcacata caacaggaat tccttataat 1920 

tcccaaggac aggccatagt tgaaagaact aatagaacac tcaaaactca attagttaaa 1980

caaaaagaag ggggagacag taaggagtgt accactcctc agatgcaact taatctagca 2040

ctctatactt taaatttttt aaacatttat agaaatcaga ctactacttc tgcagaacaa 2100 

catcttactg gtaaaaagaa cagcccacat gaaggaaaac taatttggtg gaaagataat 2160 

aaaaataaga catgggaaat agggaaggtg ataacgtggg ggagaggttt tgcttgtgtt 2220 

tcaccaggag aaaatcagct tcctgtttgg ttacccacta gacatttgaa gttctacaat 2280 

gaacccatcg gagatgcaaa gaaaagggcc tccacggaga tggtaacacc agtcacatgg 2340 

atggataatc ctatagaagt atatgttaat gatagtatat gggtacctgg ccccatagat 2400 

gatcgctgcc ctgccaaacc tgaggaagaa gggatgatga taaatatttc cattgggtat 2460 

cgttatcctc ctatttgcct agggagagca ccaggatgtt taatgcctgc agtccaaaat 2520 

tggttggtag aagtacctac tgtcagtccc atcagtagat tcacttatca catggtaagc 2580 

gggatgtcac tcaggccacg ggtaaattat ttacaagact tttcttatca aagatcatta 2640 

aaatttagac ctaaagggaa accttgcccc aaggaaattc ccaaagaatc aaaaaataca 2700 

gaagttttag tttgggaaga atgtgtggcc aatagtgcgg tgatattata aaacaatgaa 2760 

tttggaacta ttatagattg ggcacctcga ggtcaattct accacaattg ctcaggacaa 2820 

actcagtcgt gtccaagtgc acaagtgagt ccagctgttg atagcgactt aacagaaagt 2880 

ttagacaaac ataagcataa aaaattgcag tctttctacc cttgggaatg gggagaaaaa 2940 

ggaatctcta ccccaagacc aaaaatagta agtcctgttt ctggtcctga acatccagaa 3000 

ttatggaggc ttactgtggc ctcacaccac attagaattt ggtctggaaa tcaaacttta 3060 

gaaacaagag attgtaagcc attttatact gtcgacctaa attccagtct aacagttcct 3120 

ttacaaagtt gcgtaaagcc cccttatatg ctagttgtag gaaatatagt tattaaacca 3180 

gactcccaga ctataacctg tgaaaattgt agattgctta cttgcattga ttcaactttt 3240 

aattggcaac accgtattct gctggtgaga gcaagagagg gcgtgtggat ccctgtgtcc 3300 

atggaccgac cgtgggaggc ctcaccatcc gtccatattt tgactgaagt attaaaaggt 3360 

gttttaaata gatccaaaag attcattttt actttaattg cagtgattat gggattaatt 3420 

gcagtcacag ctacggctgc tgtagcagga gttgcattgc actcttctgt tcagtcagta 3480 

aactttgtta atgattggca aaagaattct acaagattgt ggaattcaca atctagtatt 3540 

gatcaaaaat tggcaaatca aattaatgat cttagacaaa ctgtcatttg gatgggagac 3600 

agactcatga gcttagaaca tcgtttccag ttacaatgtg actggaatac gtcagatttt 3660 

tgtattacac cccaaattta taatgagtct gagcatcact gggacatggt tagacgccat 3720 

ctacagggaa gagaagataa tctcacttta gacatttcca aattaaaaga acaaattttc 3780 

gaagcatcaa aagcccattt aaatttggtg ccaggaactg aggcaattgc aggagttgct 3840 

gatggcctcg caaatcttaa ccctgtcact tgggttaaga ccattggaag tacatcgatt 3900 

ataaatctca tattaatcct tgtgtgcctg ttttgtctgt tgttagtctg caggtgtacc 3960 

caacagctcc gaagagacag cgaccatcga gaacgggcca tgatgacgat ggcggttttg 4020 

tcgaaaagaa aagggggaaa tgtggggaaa agcaagagag atcaaattgt tactgtgtct 4080 

gtgtag 4086 

<210> 26 

<211> 694 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)

<400> 26 

Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg His Arg Asn Arg Ala
1 5 10 15

Pro Leu Thr His Lys Met Asn Lys Met Val Thr Ser Glu Glu Gln Met
20 25 30

Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro Thr Trp Ala Gln Leu
35 40 45

Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu Glu Asn Thr Lys Val
50 55 60

Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala Leu Met Ile Val Ser
65 70 75 80

Met Val Val Ser Leu Pro Met Pro Ala Gly Ala Ala Ala Ala Asn Tyr
85 90 95

Thr Tyr Trp Ala Tyr Val Pro Phe Pro Pro Leu Ile Arg Ala Val Thr
100 105 110

Trp Met Asp Asn Pro Thr Glu Val Tyr Val Asn Asp Ser Val Trp Val
115 120 125

Pro Gly Pro Ile Asp Asp Arg Cys Pro Ala Lys Pro Glu Glu Glu Gly
130 135 140

Met Met Ile Asn Ile Ser Ile Gly Tyr His Tyr Pro Pro Ile Cys Leu
145 150 155 160

Gly Arg Ala Pro Gly Cys Leu Met Pro Ala Val Gln Asn Trp Leu Val
165 170 175

Glu Val Pro Thr Val Ser Pro Ile Cys Arg Phe Thr Tyr His Met Val
180 185 190

Ser Gly Met Ser Leu Arg Pro Arg Val Asn Tyr Leu Gln Asp Phe Ser
195 200 205

Tyr Gln Arg Ser Leu Lys Phe Arg Pro Lys Gly Lys Pro Cys Pro Lys
210 215 220

Glu Ile Pro Lys Glu Ser Lys Asn Thr Glu Val Leu Val Trp Glu Glu
225 230 235 240

Cys Val Ala Asn Ser Ala Val Ile Leu Gln Asn Asn Glu Phe Gly Thr
245 250 255

Ile Ile Asp Trp Ala Pro Arg Gly Gln Phe Tyr His Asn Cys Ser Gly
260 265 270

Gln Thr Gln Ser Cys Gln Ser Ala Gln Val Ser Pro Ala Val Asp Ser
275 280 285

Asp Leu Thr Glu Ser Leu Asp Lys His Lys His Lys Lys Leu Gln Ser
290 295 300

Phe Tyr Pro Trp Glu Trp Gly Glu Lys Gly Ile Ser Thr Pro Arg Pro
305 310 315 320

Lys Ile Val Ser Pro Val Ser Gly Pro Glu His Pro Glu Leu Trp Arg
325 330 335

Leu Thr Val Ala Ser His His Ile Arg Ile Trp Ser Gly Asn Gln Thr
340 345 350

Leu Glu Thr Arg Asp Arg Lys Pro Phe Tyr Thr Ile Asp Leu Asn Ser
355 360 365

Ser Leu Thr Val Pro Leu Gln Ser Cys Val Lys Pro Pro Tyr Met Leu
370 375 380

Val Val Gly Asn Ile Val Ile Lys Pro Asp Ser Gln Thr Ile Thr Cys
385 390 395 400

Glu Asn Cys Arg Leu Leu Thr Cys Ile Asp Ser Thr Phe Asn Trp Gln
405 410 415

His Arg Ile Leu Leu Val Arg Ala Arg Glu Gly Val Trp Ile Pro Val
420 425 430

Ser Met Asp Arg Pro Trp Glu Ala Ser Pro Ser Val His Ile Leu Thr
435 440 445

Glu Val Leu Lys Gly Val Leu Asn Arg Ser Lys Arg Phe Ile Phe Thr
450 455 460

Leu Ile Ala Val Ile Met Gly Leu Ile Ala Val Thr Ala Thr Ala Ala
465 470 475 480

Val Ala Gly Val Ala Leu His Ser Ser Val Gln Ser Val Asn Phe Val
485 490 495

Asn Asp Trp Gln Lys Asn Ser Thr Arg Leu Trp Asn Ser Gln Ser Ser
500 505 510

Ile Asp Gln Lys Leu Ala Asn Gln Ile Asn Asp Leu Arg Gln Thr Val
515 520 525

Ile Trp Met Gly Asp Arg Leu Met Ser Leu Glu His Arg Phe Gln Leu
530 535 540

Gln Cys Asp Trp Asn Thr Ser Asp Phe Cys Ile Thr Pro Gln Ile Tyr
545 550 555 560

Asn Glu Ser Glu His His Trp Asp Met Val Arg Arg His Leu Gln Gly
565 570 575

Arg Glu Asp Asn Leu Thr Leu Asp Ile Ser Lys Leu Lys Glu Gln Ile
580 585 590

Phe Glu Ala Ser Lys Ala His Leu Asn Leu Val Pro Gly Thr Glu Ala
595 600 605

Ile Ala Gly Val Ala Asp Gly Leu Ala Asn Leu Asn Pro Val Thr Trp
610 615 620

Val Lys Thr Ile Gly Ser Thr Thr Ile Ile Asn Leu Ile Leu Ile Leu
625 630 635 640

Val Cys Leu Phe Cys Leu Leu Leu Val Cys Arg Cys Thr Gln Gln Leu
645 650 655

Arg Arg Asp Ser Asp His Arg Glu Arg Ala Met Met Thr Met Ala Val
660 665 670

Leu Ser Lys Arg Lys Gly Gly Asn Val Gly Lys Ser Lys Arg Asp Gln
675 680 685

Ile Val Thr Val Ser Val
690

<210> 27 

<211> 1361 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<220> 

<221> SITE 

<222> 917 

<223> Xaa is any amino acid

<400> 27 

Met Gly Pro Leu Gln Pro Gly Leu Pro Ser Pro Ala Met Ile Pro Lys
1 5 10 15

Asp Trp Pro Leu Ile Ile Ile Asp Leu Lys Asp Cys Phe Phe Thr Ile
20 25 30

Pro Leu Ala Glu Gln Asp Cys Glu Lys Phe Ala Phe Thr Ile Pro Ala
35 40 45

Ile Asn Asn Lys Glu Pro Ala Thr Arg Phe Gln Trp Lys Val Leu Pro
50 55 60

Gln Gly Met Leu Asn Ser Pro Thr Ile Cys Gln Thr Phe Val Gly Arg
65 70 75 80

Ala Leu Gln Pro Val Arg Glu Lys Phe Ser Asp Cys Tyr Ile Ile His
85 90 95

Tyr Ile Asp Asp Ile Leu Cys Ala Ala Glu Thr Lys Asp Lys Leu Ile
100 105 110

Asp Cys Tyr Thr Phe Leu Gln Ala Glu Val Ala Asn Ala Gly Leu Ala
115 120 125

Ile Ala Ser Asp Lys Ile Gln Thr Ser Thr Pro Phe His Tyr Leu Gly
130 135 140

Met Gln Ile Glu Asn Arg Lys Ile Lys Pro Gln Lys Ile Glu Ile Arg
145 150 155 160

Lys Asp Thr Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu Leu Gly Asp
165 170 175

Ile Asn Trp Ile Arg Pro Thr Leu Gly Ile Pro Thr Tyr Ala Met Ser
180 185 190

Asn Leu Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn Ser Gln Arg
195 200 205

Ile Leu Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val Glu Glu Lys
210 215 220

Ile Gln Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala Pro Leu Gln
225 230 235 240

Leu Leu Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile Ile Ile Gln
245 250 255

Asn Thr Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser Thr Val Lys
260 265 270

Thr Phe Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile Gly Gln Thr
275 280 285

Arg Leu Arg Ile Thr Lys Leu Cys Gly Asn Asp Pro Asp Lys Ile Val
290 295 300

Val Pro Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile Asn Ser Gly
305 310 315 320

Ala Trp Gln Ile Gly Leu Ala Asn Phe Val Gly Leu Ile Asp Asn His
325 330 335

Tyr Pro Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr Thr Trp Ile
340 345 350

Leu Pro Lys Ile Thr Arg Arg Glu Pro Leu Glu Asn Ala Leu Thr Val
355 360 365

Phe Thr Asp Gly Ser Ser Asn Gly Lys Ala Ala Tyr Thr Gly Pro Lys
370 375 380

Glu Arg Val Ile Lys Thr Pro Tyr Gln Ser Ala Gln Arg Asp Glu Leu
385 390 395 400

Val Ala Val Ile Thr Val Leu Gln Asp Phe Asp Gln Pro Ile Asn Ile
405 410 415

Ile Ser Asp Ser Ala Tyr Val Val Gln Ala Thr Arg Asp Val Glu Thr
420 425 430

Ala Leu Ile Lys Tyr Ser Met Asp Asp Gln Leu Asn Gln Leu Phe Asn
435 440 445

Leu Leu Gln Gln Thr Val Arg Lys Arg Asn Phe Pro Phe Tyr Ile Thr
450 455 460

Tyr Ile Arg Ala His Thr Asn Leu Pro Gly Pro Leu Thr Lys Ala Asn
465 470 475 480

Glu Gln Ala Asp Leu Leu Val Ser Ser Ala Leu Ile Lys Ala Gln Glu
485 490 495

Leu His Ala Leu Thr His Val Asn Ala Ala Gly Leu Lys Asn Lys Phe
500 505 510

Asp Val Thr Trp Lys Gln Ala Lys Asp Ile Val Gln His Cys Thr Gln
515 520 525

Cys Gln Val Leu His Leu Pro Thr Gln Glu Ala Gly Val Asn Pro Arg
530 535 540

Gly Leu Cys Pro Asn Ala Leu Trp Gln Met Asp Val Thr His Val Pro
545 550 555 560

Ser Phe Gly Arg Leu Ser Tyr Val His Val Thr Val Asp Thr Tyr Ser
565 570 575

His Phe Ile Trp Ala Thr Cys Gln Thr Gly Glu Ser Thr Ser His Val
580 585 590

Lys Lys His Leu Leu Ser Cys Phe Ala Val Met Gly Val Pro Glu Lys
595 600 605

Ile Lys Thr Asp Asn Gly Pro Gly Tyr Cys Ser Lys Ala Phe Gln Lys
610 615 620

Phe Leu Ser Gln Trp Lys Ile Ser His Thr Thr Gly Ile Pro Tyr Asn
625 630 635 640

Ser Gln Gly Gln Ala Ile Val Glu Arg Thr Asn Arg Thr Leu Lys Thr
645 650 655

Gln Leu Val Lys Gln Lys Glu Gly Gly Asp Ser Lys Glu Cys Thr Thr
660 665 670

Pro Gln Met Gln Leu Asn Leu Ala Leu Tyr Thr Leu Asn Phe Leu Asn
675 680 685

Ile Tyr Arg Asn Gln Thr Thr Thr Ser Ala Glu Gln His Leu Thr Gly
690 695 700

Lys Lys Asn Ser Pro His Glu Gly Lys Leu Ile Trp Trp Lys Asp Asn
705 710 715 720

Lys Asn Lys Thr Trp Glu Ile Gly Lys Val Ile Thr Trp Gly Arg Gly
725 730 735

Phe Ala Cys Val Ser Pro Gly Glu Asn Gln Leu Pro Val Trp Leu Pro
740 745 750

Thr Arg His Leu Lys Phe Tyr Asn Glu Pro Ile Gly Asp Ala Lys Lys
755 760 765

Arg Ala Ser Thr Glu Met Val Thr Pro Val Thr Trp Met Asp Asn Pro
770 775 780

Ile Glu Val Tyr Val Asn Asp Ser Ile Trp Val Pro Gly Pro Ile Asp
785 790 795 800

Asp Arg Cys Pro Ala Lys Pro Glu Glu Glu Gly Met Met Ile Asn Ile
805 810 815

Ser Ile Gly Tyr Arg Tyr Pro Pro Ile Cys Leu Gly Arg Ala Pro Gly
820 825 830

Cys Leu Met Pro Ala Val Gln Asn Trp Leu Val Glu Val Pro Thr Val
835 840 845

Ser Pro Ile Ser Arg Phe Thr Tyr His Met Val Ser Gly Met Ser Leu
850 855 860

Arg Pro Arg Val Asn Tyr Leu Gln Asp Phe Ser Tyr Gln Arg Ser Leu
865 870 875 880

Lys Phe Arg Pro Lys Gly Lys Pro Cys Pro Lys Glu Ile Pro Lys Glu
885 890 895

Ser Lys Asn Thr Glu Val Leu Val Trp Glu Glu Cys Val Ala Asn Ser
900 905 910

Ala Val Ile Leu Xaa Asn Asn Glu Phe Gly Thr Ile Ile Asp Trp Ala
915 920 925

Pro Arg Gly Gln Phe Tyr His Asn Cys Ser Gly Gln Thr Gln Ser Cys
930 935 940

Pro Ser Ala Gln Val Ser Pro Ala Val Asp Ser Asp Leu Thr Glu Ser
945 950 955 960

Leu Asp Lys His Lys His Lys Lys Leu Gln Ser Phe Tyr Pro Trp Glu
965 970 975

Trp Gly Glu Lys Gly Ile Ser Thr Pro Arg Pro Lys Ile Val Ser Pro
980 985 990

Val Ser Gly Pro Glu His Pro Glu Leu Trp Arg Leu Thr Val Ala Ser
995 1000 1005

His His Ile Arg Ile Trp Ser Gly Asn Gln Thr Leu Glu Thr Arg Asp
1010 1015 1020

Cys Lys Pro Phe Tyr Thr Val Asp Leu Asn Ser Ser Leu Thr Val Pro
1025 1030 1035 1040

Leu Gln Ser Cys Val Lys Pro Pro Tyr Met Leu Val Val Gly Asn Ile
1045 1050 1055

Val Ile Lys Pro Asp Ser Gln Thr Ile Thr Cys Glu Asn Cys Arg Leu
1060 1065 1070

Leu Thr Cys Ile Asp Ser Thr Phe Asn Trp Gln His Arg Ile Leu Leu
1075 1080 1085

Val Arg Ala Arg Glu Gly Val Trp Ile Pro Val Ser Met Asp Arg Pro
1090 1095 1100

Trp Glu Ala Ser Pro Ser Val His Ile Leu Thr Glu Val Leu Lys Gly
1105 1110 1115 1120

Val Leu Asn Arg Ser Lys Arg Phe Ile Phe Thr Leu Ile Ala Val Ile
1125 1130 1135

Met Gly Leu Ile Ala Val Thr Ala Thr Ala Ala Val Ala Gly Val Ala
1140 1145 1150

Leu His Ser Ser Val Gln Ser Val Asn Phe Val Asn Asp Trp Gln Lys
1155 1160 1165

Asn Ser Thr Arg Leu Trp Asn Ser Gln Ser Ser Ile Asp Gln Lys Leu
1170 1175 1180

Ala Asn Gln Ile Asn Asp Leu Arg Gln Thr Val Ile Trp Met Gly Asp
1185 1190 1195 1200

Arg Leu Met Ser Leu Glu His Arg Phe Gln Leu Gln Cys Asp Trp Asn
1205 1210 1215

Thr Ser Asp Phe Cys Ile Thr Pro Gln Ile Tyr Asn Glu Ser Glu His
1220 1225 1230

His Trp Asp Met Val Arg Arg His Leu Gln Gly Arg Glu Asp Asn Leu
1235 1240 1245

Thr Leu Asp Ile Ser Lys Leu Lys Glu Gln Ile Phe Glu Ala Ser Lys
1250 1255 1260

Ala His Leu Asn Leu Val Pro Gly Thr Glu Ala Ile Ala Gly Val Ala
1265 1270 1275 1280

Asp Gly Leu Ala Asn Leu Asn Pro Val Thr Trp Val Lys Thr Ile Gly
1285 1290 1295

Ser Thr Ser Ile Ile Asn Leu Ile Leu Ile Leu Val Cys Leu Phe Cys
1300 1305 1310

Leu Leu Leu Val Cys Arg Cys Thr Gln Gln Leu Arg Arg Asp Ser Asp
1315 1320 1325

His Arg Glu Arg Ala Met Met Thr Met Ala Val Leu Ser Lys Arg Lys
1330 1335 1340

Gly Gly Asn Val Gly Lys Ser Lys Arg Asp Gln Ile Val Thr Val Ser
1345 1350 1355 1360

Val



<210> 
<211> 
<212> 
<213>

Pro Ala Val Asp Ser Asp Leu Thr Glu Ser Leu Asp Lys His Lys His
290 295 300

Lys Lys Leu Gln Ser Phe Tyr Pro Trp Glu Trp Gly Glu Lys Gly Ile
305 310 315 320

Ser Thr Pro Arg Pro Lys Ile Val Ser Pro Val Ser Gly Pro Glu His
325 330 335

Pro Glu Leu Trp Arg Leu Thr Val Ala Ser His His Ile Arg Ile Trp
340 345 350

Ser Gly Asn Gln Thr Leu Glu Thr Arg Asp Arg Lys Pro Phe Tyr Thr
355 360 365

Ile Asp Leu Asn Ser Ser Leu Thr Val Pro Leu Gln Ser Cys Val Lys
370 375 380

Pro Pro Tyr Met Leu Val Val Gly Asn Ile Val Ile Lys Pro Asp Ser
385 390 395 400

Gln Thr Ile Thr Cys Glu Asn Cys Arg Leu Leu Thr Cys Ile Asp Ser
405 410 415

Thr Phe Asn Trp Gln His Arg Ile Leu Leu Val Arg Ala Arg Glu Gly
420 425 430

Val Trp Ile Pro Val Ser Met Asp Arg Pro Trp Glu Ala Ser Pro Ser
435 440 445

Val His Ile Leu Thr Glu Val Leu Lys Gly Val Leu Asn Arg Ser Lys
450 455 460

Arg Phe Ile Phe Thr Leu Ile Ala Val Ile Met Gly Leu Ile Ala Val
465 470 475 480

Thr Ala Thr Ala Ala Val Ala Gly Val Ala Leu His Ser Ser Val Gln
485 490 495

Ser Val Asn Phe Val Asn Asp Trp Gln Lys Asn Ser Thr Arg Leu Trp
500 505 510

Asn Ser Gln Ser Ser Ile Asp Gln Lys Leu Ala Asn Gln Ile Asn Asp
515 520 525

Leu Arg Gln Thr Val Ile Trp Met Gly Asp Arg Leu Met Ser Leu Glu
530 535 540

His Arg Phe Gln Leu Gln Cys Asp Trp Asn Thr Ser Asp Phe Cys Ile
545 550 555 560

Thr Pro Gln Ile Tyr Asn Glu Ser Glu His His Trp Asp Met Val Arg
565 570 575

Arg His Leu Gln Gly Arg Glu Asp Asn Leu Thr Leu Asp Ile Ser Lys
580 585 590

Leu Lys Glu Gln Ile Phe Glu Ala Ser Lys Ala His Leu Asn Leu Val
595 600 605

Pro Gly Thr Glu Ala Ile Ala Gly Val Ala Asp Gly Leu Ala Asn Leu
610 615 620

Asn Pro Val Thr Trp Val Lys Thr Ile Gly Ser Thr Thr Ile Ile Asn
625 630 635 640

Leu Ile Leu Ile Leu Val Cys Leu Phe Cys Leu Leu Leu Val Cys Arg
645 650 655

Cys Thr Gln Gln Leu Arg Arg Asp Ser Asp His Arg Glu Arg Ala Met
660 665 670

Met Thr Met Ala Val Leu Ser Lys Arg Lys Gly Gly Asn Val Gly Lys
675 680 685

Ser Lys Arg Asp Gln Ile Val Thr Val Ser Val
690 695

<210> 29 
<211> 294 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 29 

agttctacaa tgaacccatc agagatgcaa agaaaagcac ctccgcggag acggagacat 60 
cgcaatcgag caccgttgac tcacaagatg aacaaaatgg tgacgtcaga agaacagatg 120 
aagttgccat ccaccaagaa ggcagagccg ccaacttggg cacaactaaa gaagctgacg 180 
cagttagcta caaaatatct agagaacaca aaggtgacac aaaccccaga gagtatgctg 240 
cttgcagcct tgatgattgt atcaatggtg gtaagtctcc ctatgcctgc agga 294 

<210> 30 
<211> 57 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K)

<400> 30 

tctgcaggtg tacccaacag ctccgaagag acagcgacca tcgagaacgg gccatga 57 

<210> 31 
<211> 105 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 31 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Gly Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala
65 70 75 80

Leu Met Ile Val Ser Met Val Ser Ala Gly Val Pro Asn Ser Ser Glu
85 90 95

Glu Thr Ala Thr Ile Glu Asn Gly Pro
100 105

<210> 32
<211> 86
<212> PRT

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 32

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Ser Ala Gly Val Pro Asn Ser Ser Glu Glu Thr Ala
65 70 75 80

Thr Ile Glu Asn Gly Pro
85

<210> 33 
<211> 74 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 33

Met Asn Pro Ser Glu Met Gln Arg Lys Gly Pro Pro Gln Arg Cys Leu
1 5 10 15

Gln Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser Arg Thr Gly
20 25 30

His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly Lys Cys Gly
35 40 45

Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val Glu Arg Ser
50 55 60

Arg His Arg Arg Leu His Phe Val Leu Tyr
65 70

<210> 34 
<211> 79 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 34 

Met Asn Ser Leu Glu Met Gln Arg Lys Val Trp Arg Trp Arg His Pro
1 5 10 15

Asn Arg Leu Ala Ser Leu Gln Val Tyr Pro Ala Ala Pro Lys Arg Gln
20 25 30

Gln Pro Ala Arg Met Gly His Ser Asp Asp Gly Gly Phe Val Lys Lys
35 40 45

Lys Arg Gly Gly Tyr Val Arg Lys Arg Glu Ile Arg Leu Ser Leu Cys
50 55 60

Leu Cys Arg Lys Gly Arg His Lys Lys Leu His Phe Val Leu Tyr
65 70 75

<210> 35 

<211> 129 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 

<400> 35 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Ile Leu Gln Val Tyr Pro Thr Ala Pro Lys Arg
65 70 75 80

Gln Arg Pro Ser Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu
85 90 95

Lys Lys Arg Gly Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr
100 105 110

Cys Val Cys Val Glu Arg Ser Arg His Arg Arg Leu His Phe Val Leu
115 120 125

Tyr

<210> 36 
<211> 125 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 36

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser
65 70 75 80

Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly
85 90 95

Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val
100 105 110

Glu Arg Ser Arg His Arg Arg Leu His Phe Val Leu Tyr
115 120 125

<210> 37 

<211> 144 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 

<400> 37 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg 
1 5 10 15 

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr 
20 25 30 

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala
65 70 75 80

Leu Met Ile Val Ser Met Val Val Tyr Pro Thr Ala Pro Lys Arg Gln
85 90 95

Arg Pro Ser Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu Lys
100 105 110

Lys Arg Gly Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys
115 120 125

Val Cys Val Glu Arg Ser Arg His Arg Arg Leu His Phe Val Leu Tyr
130 135 140

<210> 38 
<211> 74 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 38

Met Asn Pro Ser Glu Met Gln Arg Lys Gly Pro Pro Gln Arg Cys Leu
1 5 10 15

Gln Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser Arg Thr Gly
20 25 30

His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly Lys Cys Gly
35 40 45

Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val Glu Arg Ser
50 55 60

Arg His Arg Arg Leu His Phe Val Leu Tyr
65 70

<210> 39 
<211> 74 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 39

Met Asn Pro Ser Glu Met Gln Arg Lys Gly Pro Pro Gln Arg Cys Leu
1 5 10 15

Gln Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser Arg Thr Gly
20 25 30

His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly Lys Cys Gly
35 40 45

Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val Glu Arg Ser
50 55 60

Arg His Arg Arg Leu His Phe Val Leu Tyr
65 70

<210> 40 
<211> 44 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 40

Met Glu Tyr Lys Asn Arg His Leu Lys Phe Tyr Asn Glu Pro Ile Gly
1 5 10 15

Asp Ala Lys Lys Arg Ala Ser Thr Glu Met Ser Ala Gly Val Pro Asn
20 25 30

Ser Ser Glu Glu Thr Ala Thr Ile Glu Asn Gly Pro
35 40

<210> 41 
<211> 74 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 41 

Met Asn Pro Ser Glu Met Gln Arg Lys Gly Pro Pro Gln Arg Cys Leu
1 5 10 15

Gln Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser Arg Thr Gly
20 25 30

His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly Lys Cys Gly
35 40 45

Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val Glu Arg Ser
50 55 60

Arg His Arg Arg Leu His Phe Val Leu Tyr
65 70

<210> 42 
<211> 86 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 42

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Ser Ala Gly Val Pro Asn Ser Ser Glu Glu Thr Ala
65 70 75 80

Thr Ile Glu Asn Gly Pro
85

<210> 43 

<211> 105 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 

<400> 43 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala
65 70 75 80

Leu Met Ile Val Ser Met Val Ser Ala Gly Val Pro Asn Ser Ser Glu
85 90 95

Glu Thr Ala Thr Ile Glu Asn Gly Pro
100 105

<210> 44 
<211> 127 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 44 

Met Val Thr Pro Val Thr Trp Met Asp Asn Pro Ile Glu Val Tyr Val
1 5 10 15

Asn Asp Ser Val Trp Val Pro Gly Pro Thr Asp Asp Arg Cys Pro Ala
20 25 30

Lys Pro Glu Glu Glu Gly Met Met Ile Asn Ile Ser Ile Val Tyr Arg
35 40 45

Tyr Pro Pro Ile Cys Leu Gly Arg Ala Pro Gly Cys Leu Met Pro Ala
50 55 60

Val Gln Asn Cys Leu Gln Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg
65 70 75 80

Pro Ser Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys
85 90 95

Arg Gly Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val
100 105 110

Cys Val Glu Arg Ser Arg His Arg Arg Leu His Phe Val Leu Tyr
115 120 125

<210> 45 

<211> 105 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 

<400> 45 

Met Val Thr Pro Val Thr Trp Met Asp Asn Pro Ile Glu Val Tyr Val
1 5 10 15

Asn Asp Ser Glu Trp Val Pro Gly Pro Thr Asp Asp Arg Cys Pro Ala
20 25 30

Lys Pro Glu Glu Glu Gly Met Met Ile Asn Ile Ser Ile Gly Leu Gln
35 40 45

Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser Arg Thr Gly His
50 55 60

Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly Lys Cys Gly Glu
65 70 75 80

Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val Glu Arg Ser Arg
85 90 95

His Arg Arg Leu His Phe Val Met Cys
100 105

<210> 46 
<211> 79 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 46

Met Asn Ser Leu Glu Met Gln Arg Lys Val Trp Arg Trp Arg His Pro
1 5 10 15

Asn Arg Leu Ala Ser Leu Gln Val Tyr Pro Ala Ala Pro Lys Arg Gln
20 25 30

Gln Pro Ala Arg Met Gly His Ser Asp Asp Gly Gly Phe Val Lys Lys
35 40 45

Lys Arg Gly Gly Tyr Val Arg Lys Arg Glu Ile Arg Leu Ser Leu Cys
50 55 60

Leu Cys Arg Lys Gly Arg His Lys Lys Leu His Phe Asp Leu Tyr
65 70 75

<210> 47 
<211> 214 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 47

Met Asn Ser Leu Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Ser Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala
65 70 75 80

Leu Met Ile Val Ser Met Val Val Ser Leu Pro Met Pro Ala Gly Ala
85 90 95

Ala Ala Ala Asn Tyr Thr Tyr Trp Ala Tyr Val Pro Phe Pro Pro Leu
100 105 110

Ile Arg Ala Val Thr Trp Met Asp Asn Pro Thr Glu Val Tyr Val Asn
115 120 125

Asp Ser Val Trp Val Pro Gly Pro Ile Asp Asp Arg Cys Pro Ala Lys
130 135 140

Pro Glu Glu Glu Gly Met Met Ile Asn Ile Ser Ile Gly Tyr His Tyr
145 150 155 160

Pro Pro Ile Cys Leu Gly Arg Ala Pro Gly Cys Leu Met Pro Ala Val
165 170 175

Gln Asn Trp Leu Val Glu Val Pro Thr Val Ser Pro Ile Cys Arg Phe
180 185 190

Thr Tyr His Met Ser Ala Gly Val Pro Asn Ser Ser Glu Glu Thr Ala
195 200 205

Thr Ile Glu Asn Gly Pro
210

<210> 48 
<211> 129 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 48 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Thr Leu Gln Val Tyr Pro Thr Ala Pro Lys Arg
65 70 75 80

Gln Arg Pro Ser Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu
85 90 95

Lys Lys Arg Gly Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr
100 105 110

Cys Val Cys Val Glu Arg Ser Arg His Arg Arg Leu His Phe Val Met
115 120 125 

Tyr 

<210> 49 
<211> 125 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 49

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Tyr Pro Thr Ala Pro Lys Arg Gln Arg Pro Ser
65 70 75 80

Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu Lys Lys Arg Gly
85 90 95

Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys Val Cys Val
100 105 110

Glu Arg Ser Arg His Arg Arg Leu His Phe Val Met Tyr
115 120 125 

<210> 50 

<211> 145 

<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<220>

<221> SITE

<222> 64

<223> Xaa is any amino acid

<400> 50

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Xaa
50 55 60

Leu Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala
65 70 75 80

Ala Leu Met Ile Val Ser Met Val Val Tyr Pro Thr Ala Pro Lys Arg
85 90 95

Gln Arg Pro Ser Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu
100 105 110

Lys Lys Arg Gly Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr
115 120 125

Cys Val Cys Val Glu Arg Ser Arg His Arg Arg Leu His Phe Val Met
130 135 140 

Tyr 
145 



<210> 51
<211> 4657
<212> DNA

<213> Artificial Sequence
<220>

<223> pCMVKm2.cORFopt HML-2 vector
<400> 51 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atgaacccca gcgagatgca 1680

gcgcaaggcc cccccccgcc gccgccgcca ccgcaaccgc gcccccctga cccacaagat 1740

gaacaagatg gtgaccagcg aggagcagat gaagctgccc agcaccaaga aggccgagcc 1800

ccccacctgg gcccagctga agaagctgac ccagctggcc accaagtacc tggagaacac 1860

caaggtgacc cagacccccg agagcatgct gctggccgcc ctgatgatcg tgagcatggt 1920

gagcgccggc gtgcccaaca gcagcgagga gaccgccacc atcgagaacg gccccgctta 1980

aagaattcag actcgagcaa gtctagaaag ccatggatat cggatccact acgcgttaga 2040

gctcgctgat cagcctcgac tgtgccttct agttgccagc catctgttgt ttgcccctcc 2100

cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta ataaaatgag 2160

gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg ggtggggcag 2220

gacagcaagg gggaggattg ggaagacaat agcagggggg tgggcgaaga actccagcat 2280

gagatccccg cgctggagga tcatccagcc ggcgtcccgg aaaacgattc cgaagcccaa 2340 

cctttcatag aaggcggcgg tggaatcgaa atctcgtgat ggcaggttgg gcgtcgcttg 2400 

gtcggtcatt tcgaacccca gagtcccgct cagaagaact cgtcaagaag gcgatagaag 2460 

gcgatgcgct gcgaatcggg agcggcgata ccgtaaagca cgaggaagcg gtcagcccat 2520 

tcgccgccaa gctcttcagc aatatcacgg gtagccaacg ctatgtcctg atagcggtcc 2580 

gccacaccca gccggccaca gtcgatgaat ccagaaaagc ggccattttc caccatgata 2640 

ttcggcaagc aggcatcgcc atgggtcacg acgagatcct cgccgtcggg catgcgcgcc 2700 

ttgagcctgg cgaacagttc ggctggcgcg agcccctgat gctcttcgtc cagatcatcc 2760 

tgatcgacaa gaccggcttc catccgagta cgtgctcgct cgatgcgatg tttcgcttgg 2820 

tggtcgaatg ggcaggtagc cggatcaagc gtatgcagcc gccgcattgc atcagccatg 2880 

atggatactt tctcggcagg agcaaggtga gatgacagga gatcctgccc cggcacttcg 2940 

cccaatagca gccagtccct tcccgcttca gtgacaacgt cgagcacagc tgcgcaagga 3000 

acgcccgtcg tggccagcca cgatagccgc gctgcctcgt cctgcagttc attcagggca 3060 

ccggacaggt cggtcttgac aaaaagaacc gggcgcccct gcgctgacag ccggaacacg 3120 

gcggcatcag agcagccgat tgtctgttgt gcccagtcat agccgaatag cctctccacc 3180 

caagcggccg gagaacctgc gtgcaatcca tcttgttcaa tcatgcgaaa cgatcctcat 3240 

cctgtctctt gatcagatct tgatcccctg cgccatcaga tccttggcgg caagaaagcc 3300 

atccagttta ctttgcaggg cttcccaacc ttaccagagg gcgccccagc tggcaattcc 3360 

ggttcgcttg ctgtccataa aaccgcccag tctagctatc gccatgtaag cccactgcaa 3420 

gctacctgct ttctctttgc gcttgcgttt tcccttgtcc agatagccca gtagctgaca 3480 

ttcatccggg gtcagcaccg tttctgcgga ctggctttct acgtgttccg cttcctttag 3540 

cagcccttgc gccctgagtg cttgcggcag cgtgaagcta attcatggtt aaatttttgt 3600 

taaatcagct cattttttaa ccaataggcc gaaatcggca aaatccctta taaatcaaaa 3660 

gaatagcccg agatagggtt gagtgttgtt ccagtttgga acaagagtcc actattaaag 3720 

aacgtggact ccaacgtcaa agggcgaaaa accgtctatc agggcgatgg ccggatcagc 3780 

ttatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc aggcgctctt 3840 

ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3900 

ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3960 

tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 4020

tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 4080 

gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 4140 

ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 4200 

tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 4260 

agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 4320 

atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 4380 

acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 4440 

actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 4500 

tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 4560 

tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4620 

tcttttctac tgaacggtga tccccaccgg aattgcg 4657 

<210> 52 
<211> 4774 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> pCMVKm2.pCAPSopt HML-2 vector
<400> 52 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60 

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120 

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180 

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240 

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300 

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360 

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420 

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480 

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540 

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600 

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660 

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840 

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900 

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960 

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020 

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080 

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140 

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200 

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260 

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320 

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380 

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440 

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500 

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560 

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620 

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atgaacccca gcgagatgca 1680 

gcgcaaggcc cccccccgcc gccgccgcca ccgcaaccgc gcccccctga cccacaagat 1740 

gaacaagatg gtgaccagcg aggagcagat gaagctgccc agcaccaaga aggccgagcc 1800 

ccccacctgg gcccagctga agaagctgac ccagctggcc accaagtacc tggagaacac 1860 

caaggtgacc cagacccccg agagcatgct gctggccgcc ctgatgatcg tgagcatggt 1920 

ggtgtacccc accgccccca agcgccagcg ccccagccgc accggccacg acgacgacgg 1980 

cggcttcgtg gagaagaagc gcggcaagtg cggcgagaag caggagcgca gcgactgcta 2040 

ctgcgtgtgc gtggagcgca gccgccaccg ccgcctgcac ttcgtgctgt acgcttaaag 2100 

aattcagact cgagcaagtc tagaaagcca tggatatcgg atccactacg cgttagagct 2160 

cgctgatcag cctcgactgt gccttctagt tgccagccat ctgttgtttg cccctccccc 2220 

gtgccttcct tgaccctgga aggtgccact cccactgtcc tttcctaata aaatgaggaa 2280 

attgcatcgc attgtctgag taggtgtcat tctattctgg ggggtggggt ggggcaggac 2340 

agcaaggggg aggattggga agacaatagc aggggggtgg gcgaagaact ccagcatgag 2400 

atccccgcgc tggaggatca tccagccggc gtcccggaaa acgattccga agcccaacct 2460 

ttcatagaag gcggcggtgg aatcgaaatc tcgtgatggc aggttgggcg tcgcttggtc 2520 

ggtcatttcg aaccccagag tcccgctcag aagaactcgt caagaaggcg atagaaggcg 2580 

atgcgctgcg aatcgggagc ggcgataccg taaagcacga ggaagcggtc agcccattcg 2640 

ccgccaagct cttcagcaat atcacgggta gccaacgcta tgtcctgata gcggtccgcc 2700 

acacccagcc ggccacagtc gatgaatcca gaaaagcggc cattttccac catgatattc 2760 

ggcaagcagg catcgccatg ggtcacgacg agatcctcgc cgtcgggcat gcgcgccttg 2820 

agcctggcga acagttcggc tggcgcgagc ccctgatgct cttcgtccag atcatcctga 2880 

tcgacaagac cggcttccat ccgagtacgt gctcgctcga tgcgatgttt cgcttggtgg 2940 

tcgaatgggc aggtagccgg atcaagcgta tgcagccgcc gcattgcatc agccatgatg 3000 

gatactttct cggcaggagc aaggtgagat gacaggagat cctgccccgg cacttcgccc 3060 

aatagcagcc agtcccttcc cgcttcagtg acaacgtcga gcacagctgc gcaaggaacg 3120 

cccgtcgtgg ccagccacga tagccgcgct gcctcgtcct gcagttcatt cagggcaccg 3180 

gacaggtcgg tcttgacaaa aagaaccggg cgcccctgcg ctgacagccg gaacacggcg 3240 

gcatcagagc agccgattgt ctgttgtgcc cagtcatagc cgaatagcct ctccacccaa 3300 

gcggccggag aacctgcgtg caatccatct tgttcaatca tgcgaaacga tcctcatcct 3360 

gtctcttgat cagatcttga tcccctgcgc catcagatcc ttggcggcaa gaaagccatc 3420 

cagtttactt tgcagggctt cccaacctta ccagagggcg ccccagctgg caattccggt 3480 

tcgcttgctg tccataaaac cgcccagtct agctatcgcc atgtaagccc actgcaagct 3540 

acctgctttc tctttgcgct tgcgttttcc cttgtccaga tagcccagta gctgacattc 3600 

atccggggtc agcaccgttt ctgcggactg gctttctacg tgttccgctt cctttagcag 3660 

cccttgcgcc ctgagtgctt gcggcagcgt gaagctaatt catggttaaa tttttgttaa 3720 

atcagctcat tttttaacca ataggccgaa atcggcaaaa tcccttataa atcaaaagaa 3780 

tagcccgaga tagggttgag tgttgttcca gtttggaaca agagtccact attaaagaac 3840 

gtggactcca acgtcaaagg gcgaaaaacc gtctatcagg gcgatggccg gatcagctta 3900 

tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcagg cgctcttccg 3960 

cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc 4020 

actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt 4080 

gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc 4140 

ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 4200 

acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 4260 

ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 4320 

cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 4380 

tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 4440 

gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 4500

ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 4560

acggctacac tagaaggaca gtatttggta tctgcgctct gctgaagcca gttaccttcg 4620 

gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 4680 

ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 4740 

tttctactga acggtgatcc ccaccggaat tgcg 4774 

<210> 53 
<211> 6483
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCMVKm2.gag wt PCAV vector 
<400> 53 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60 

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120 

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180 

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240 

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300 

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360 

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420 

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480 

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540 

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600 

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660 

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720 

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780 

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840 

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900 

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960 

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020 

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080 

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140 

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200 

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260 

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320 

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380 

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440 

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500 

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560 

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620 

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atggggcaaa ctgaaagtaa 1680 

atatgcctct tatctcagct ttattaaaat tcttttaaga agagggggag ttagagcttc 1740 

tacagaaaat ctaattacgc tatttcaaac aatagaacaa ttctgcccat ggtttccaga 1800 

acagggaact ttagatctaa aagattggga aaaaattggc aaagaattaa aacaagcaaa 1860 

tagggaaggt aaaatcatcc cacttacagt atggaatgat tgggccatta ttaaagcaac 1920 

tttagaacca tttcaaacag gagaagatat tgtttcagtt tctgatgccc ctaaaagctg 1980 

tgtaacagat tgtgaagaag aggcagggac agaatcccag caaggaacgg aaagttcaca 2040 

ttgtaaatat gtagcagagt ctgtaatggc tcagtcaacg caaaatgttg actacagtca 2100 

attacaggag ataatatacc ctgaatcatc aaaattgggg gaaggaggtc cagaatcatt 2160 

ggggccatca gagcctaaac cacgatcgcc atcaactcct cctcccgtgg ttcagatgcc 2220 

tgtaacatta caacctcaaa cgcaggttag acaagcacaa accccaagag aaaatcaagt 2280 

agaaagggac agagtctcta tcccggcaat gccaactcag atacagtatc cacaatatca 2340 

gccggtagaa aataagaccc aaccgctggt agtttatcaa taccggctgc caaccgagct 2400 

tcagtatcgg cctccttcag aggttcaata cagacctcaa gcggtgtgtc ctgtgccaaa 2460 

tagcacggca ccataccagc aacccacagc gatggcgtct aattcaccag caacacagga 2520 

cgcggcgctg tatcctcagc cgcccactgt gagacttaat cctacagcat cacgtagtgg 2580 

acagggtggt gcactgcatg cagtcattga tgaagccaga aaacagggcg atcttgaggc 2640 

atggcggttc ctggtaattt tacaactggt acaggccggg gaagagactc aagtaggagc 2700 

gcctgcccga gctgagacta gatgtgaacc tttcaccatg aaaatgttaa aagatataaa 2760 

ggaaggagtt aaacaatatg gatccaactc cccttatata agaacattat tagattccat 2820

tgctcatgga aatagactta ctccttatga ctgggaaatt ttggccaaat cttccctttc 2880

atcctctcag tatctacagt ttaaaacctg gtggattgat ggagtacaag aacaggtacg 2940

aaaaaatcag gctactaagc ccactgttaa tatagacgca gaccaattgt taggaacagg 3000 

tccaaattgg agcaccatta accaacaatc agtgatgcag aatgaggcta ttgaacaagt 3060 

aagggctatt tgcctcaggg cctggggaaa aattcaggac ccaggaacag ctttccctat 3120 

taattcaatt agacaaggct ctaaagagcc atatcctgac tttgtggcaa gattacaaga 3180 

tgctgctcaa aagtctatta cagatgacaa tgcccgaaaa gttattgtag aattaatggc 3240 

ctatgaaaat gcaaatccag aatgtcagtc ggccataaag ccattaaaag gaaaagttcc 3300 

agcaggagtt gatgtaatta cagaatatgt gaaggcttgt gatgggattg gaggagctat 3360 

gcataaggca atgctaatgg ctcaagcaat gagggggctc actctaggag gacaagttag 3420 

aacatttggg aaaaaatgtt ataattgtgg tcaaatcggt catctgaaaa ggagttgccc 3480 

agtcttaaat aaacagaata taataaatca agctattaca gcaaaaaata aaaagccatc 3540 

tggcctgtgt ccaaaatgtg gaaaaggaaa acattgggcc aatcaatgtc attctaaatt 3600 

tgataaggat gggcaaccat tgtcgggaaa caggaagagg ggccagcctc aggcccccca 3660 

acaaactggg gcattcccag ttcaactgtt tgttcctcag ggttttcaag gacaacaacc 3720 

cctacagaaa ataccaccac ttcagggagt cagccaatta caacaatcca acagctgtcc 3780 

cgcgccacag caggcagcac cgcagtaaga attcagactc gagcaagtct agaaagccat 3840 

ggatatcgga tccactacgc gttagagctc gctgatcagc ctcgactgtg ccttctagtt 3900 

gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc 3960 

ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt aggtgtcatt 4020 

ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca 4080 

ggggggtggg cgaagaactc cagcatgaga tccccgcgct ggaggatcat ccagccggcg 4140 

tcccggaaaa cgattccgaa gcccaacctt tcatagaagg cggcggtgga atcgaaatct 4200 

cgtgatggca ggttgggcgt cgcttggtcg gtcatttcga accccagagt cccgctcaga 4260 

agaactcgtc aagaaggcga tagaaggcga tgcgctgcga atcgggagcg gcgataccgt 4320 

aaagcacgag gaagcggtca gcccattcgc cgccaagctc ttcagcaata tcacgggtag 4380 

ccaacgctat gtcctgatag cggtccgcca cacccagccg gccacagtcg atgaatccag 4440 

aaaagcggcc attttccacc atgatattcg gcaagcaggc atcgccatgg gtcacgacga 4500 

gatcctcgcc gtcgggcatg cgcgccttga gcctggcgaa cagttcggct ggcgcgagcc 4560 

cctgatgctc ttcgtccaga tcatcctgat cgacaagacc ggcttccatc cgagtacgtg 4620 

ctcgctcgat gcgatgtttc gcttggtggt cgaatgggca ggtagccgga tcaagcgtat 4680 

gcagccgccg cattgcatca gccatgatgg atactttctc ggcaggagca aggtgagatg 4740 

acaggagatc ctgccccggc acttcgccca atagcagcca gtcccttccc gcttcagtga 4800 

caacgtcgag cacagctgcg caaggaacgc ccgtcgtggc cagccacgat agccgcgctg 4860 

cctcgtcctg cagttcattc agggcaccgg acaggtcggt cttgacaaaa agaaccgggc 4920 

gcccctgcgc tgacagccgg aacacggcgg catcagagca gccgattgtc tgttgtgccc 4980 

agtcatagcc gaatagcctc tccacccaag cggccggaga acctgcgtgc aatccatctt 5040 

gttcaatcat gcgaaacgat cctcatcctg tctcttgatc agatcttgat cccctgcgcc 5100 

atcagatcct tggcggcaag aaagccatcc agtttacttt gcagggcttc ccaaccttac 5160 

cagagggcgc cccagctggc aattccggtt cgcttgctgt ccataaaacc gcccagtcta 5220 

gctatcgcca tgtaagccca ctgcaagcta cctgctttct ctttgcgctt gcgttttccc 5280 

ttgtccagat agcccagtag ctgacattca tccggggtca gcaccgtttc tgcggactgg 5340 

ctttctacgt gttccgcttc ctttagcagc ccttgcgccc tgagtgcttg cggcagcgtg 5400 

aagctaattc atggttaaat ttttgttaaa tcagctcatt ttttaaccaa taggccgaaa 5460 

tcggcaaaat cccttataaa tcaaaagaat agcccgagat agggttgagt gttgttccag 5520 

tttggaacaa gagtccacta ttaaagaacg tggactccaa cgtcaaaggg cgaaaaaccg 5580 

tctatcaggg cgatggccgg atcagcttat gcggtgtgaa ataccgcaca gatgcgtaag 5640 

gagaaaatac cgcatcaggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 5700 

cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 5760 

atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 5820 

taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 5880 

aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 5940 

tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 6000 

gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 6060 

cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 6120 

cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 6180 

atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 6240 

tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 6300 

ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 6360 

acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 6420 

aaaaggatct caagaagatc ctttgatctt ttctactgaa cggtgatccc caccggaatt 6480 

gcg 6483

<210> 54
<211> 6340
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCMVKm2.gagopt HML-2 vector 
<400> 54 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60 

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120 

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180 

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240 

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300 

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360 

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420 

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480 

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540 

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600 

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660 

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720 

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780 

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840 

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900 

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960 

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020 

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080 

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140 

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200 

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260 

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320 

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380 

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440 

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500 

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560 

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620 

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atgggccaga ccaagagcaa 1680 

gatcaagagc aagtacgcca gctacctgag cttcatcaag atcctgctga agcgcggcgg 1740 

cgtgaaggtg agcaccaaga acctgatcaa gctgttccag atcatcgagc agttctgccc 1800 

ctggttcccc gagcagggca ccctggacct gaaggactgg aagcgcatcg gcaaggagct 1860 

gaagcaggcc ggccgcaagg gcaacatcat ccccctgacc gtgtggaacg actgggccat 1920 

catcaaggcc gccctggagc ccttccagac cgaggaggac agcgtgagcg tgagcgacgc 1980 

ccccggcagc tgcatcatcg actgcaacga gaacacccgc aagaagagcc agaaggagac 2040 

cgagggcctg cactgcgagt acgtggccga gcccgtgatg gcccagagca cccagaacgt 2100 

ggactacaac cagctgcagg aggtgatcta ccccgagacc ctgaagctgg agggcaaggg 2160 

ccccgagctg gtgggcccca gcgagagcaa gccccgcggc accagccccc tgcccgccgg 2220 

ccaggtgccc gtgaccctgc agccccagaa gcaggtgaag gagaacaaga cccagccccc 2280 

cgtggcctac cagtactggc cccccgccga gctgcagtac cgcccccccc ccgagagcca 2340 

gtacggctac cccggcatgc cccccgcccc ccagggccgc gccccctacc cccagccccc 2400 

cacccgccgc ctgaacccca ccgccccccc cagccgccag ggcagcaagc tgcacgagat 2460 

catcgacaag agccgcaagg agggcgacac cgaggcctgg cagttccccg tgaccctgga 2520 

gcccatgccc cccggcgagg gcgcccagga gggcgagccc cccaccgtgg aggcccgcta 2580 

caagagcttc agcatcaaga agctgaagga catgaaggag ggcgtgaagc agtacggccc 2640 

caacagcccc tacatgcgca ccctgctgga cagcatcgcc cacggccacc gcctgatccc 2700 

ctacgactgg gagatcctgg ccaagagcag cctgagcccc agccagttcc tgcagttcaa 2760 

gacctggtgg atcgacggcg tgcaggagca ggtgcgccgc aaccgcgccg ccaacccccc 2820 

cgtgaacatc gacgccgacc agctgctggg catcggccag aactggagca ccatcagcca 2880 

gcaggccctg atgcagaacg aggccatcga gcaggtgcgc gccatctgcc tgcgcgcctg 2940 

ggagaagatc caggaccccg gcagcacctg ccccagcttc aacaccgtgc gccagggcag 3000 

caaggagccc taccccgact tcgtggcccg cctgcaggac gtggcccaga agagcatcgc 3060 

cgacgagaag gcccgcaagg tgatcgtgga gctgatggcc tacgagaacg ccaaccccga 3120 

gtgccagagc gccatcaagc ccctgaaggg caaggtgccc gccggcagcg acgtgatcag 3180 

cgagtacgtg aaggcctgcg acggcatcgg cggcgccatg cacaaggcca tgctgatggc 3240 

ccaggccatc accggcgtgg tgctgggcgg ccaggtgcgc accttcggcc gcaagtgcta 3300

caactgcggc cagatcggcc acctgaagaa gaactgcccc gtgctgaaca agcagaacat 3360

caccatccag gccaccacca ccggccgcga gccccccgac ctgtgccccc gctgcaagaa 3420 

gggcaagcac tgggccagcc agtgccgcag caagttcgac aagaacggcc agcccctgag 3480 

cggcaacgag cagcgcggcc agccccaggc cccccagcag accggcgcct tccccatcca 3540 

gcccttcgtg ccccagggct tccagggcca gcagcccccc ctgagccagg tgttccaggg 3600 

catcagccag ctgccccagt acaacaactg cccccccccc caggccgccg tgcagcaggc 3660 

ttaaagaatt cagactcgag caagtctaga aagccatgga tatcggatcc actacgcgtt 3720 

agagctcgct gatcagcctc gactgtgcct tctagttgcc agccatctgt tgtttgcccc 3780 

tcccccgtgc cttccttgac cctggaaggt gccactccca ctgtcctttc ctaataaaat 3840 

gaggaaattg catcgcattg tctgagtagg tgtcattcta ttctgggggg tggggtgggg 3900 

caggacagca agggggagga ttgggaagac aatagcaggg gggtgggcga agaactccag 3960 

catgagatcc ccgcgctgga ggatcatcca gccggcgtcc cggaaaacga ttccgaagcc 4020 

caacctttca tagaaggcgg cggtggaatc gaaatctcgt gatggcaggt tgggcgtcgc 4080 

ttggtcggtc atttcgaacc ccagagtccc gctcagaaga actcgtcaag aaggcgatag 4140 

aaggcgatgc gctgcgaatc gggagcggcg ataccgtaaa gcacgaggaa gcggtcagcc 4200 

cattcgccgc caagctcttc agcaatatca cgggtagcca acgctatgtc ctgatagcgg 4260 

tccgccacac ccagccggcc acagtcgatg aatccagaaa agcggccatt ttccaccatg 4320 

atattcggca agcaggcatc gccatgggtc acgacgagat cctcgccgtc gggcatgcgc 4380 

gccttgagcc tggcgaacag ttcggctggc gcgagcccct gatgctcttc gtccagatca 4440 

tcctgatcga caagaccggc ttccatccga gtacgtgctc gctcgatgcg atgtttcgct 4500 

tggtggtcga atgggcaggt agccggatca agcgtatgca gccgccgcat tgcatcagcc 4560 

atgatggata ctttctcggc aggagcaagg tgagatgaca ggagatcctg ccccggcact 4620 

tcgcccaata gcagccagtc ccttcccgct tcagtgacaa cgtcgagcac agctgcgcaa 4680 

ggaacgcccg tcgtggccag ccacgatagc cgcgctgcct cgtcctgcag ttcattcagg 4740 

gcaccggaca ggtcggtctt gacaaaaaga accgggcgcc cctgcgctga cagccggaac 4800 

acggcggcat cagagcagcc gattgtctgt tgtgcccagt catagccgaa tagcctctcc 4860 

acccaagcgg ccggagaacc tgcgtgcaat ccatcttgtt caatcatgcg aaacgatcct 4920 

catcctgtct cttgatcaga tcttgatccc ctgcgccatc agatccttgg cggcaagaaa 4980 

gccatccagt ttactttgca gggcttccca accttaccag agggcgcccc agctggcaat 5040 

tccggttcgc ttgctgtcca taaaaccgcc cagtctagct atcgccatgt aagcccactg 5100 

caagctacct gctttctctt tgcgcttgcg ttttcccttg tccagatagc ccagtagctg 5160 

acattcatcc ggggtcagca ccgtttctgc ggactggctt tctacgtgtt ccgcttcctt 5220 

tagcagccct tgcgccctga gtgcttgcgg cagcgtgaag ctaattcatg gttaaatttt 5280 

tgttaaatca gctcattttt taaccaatag gccgaaatcg gcaaaatccc ttataaatca 5340 

aaagaatagc ccgagatagg gttgagtgtt gttccagttt ggaacaagag tccactatta 5400 

aagaacgtgg actccaacgt caaagggcga aaaaccgtct atcagggcga tggccggatc 5460 

agcttatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgct 5520 

cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 5580 

cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga 5640 

acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt 5700 

ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt 5760 

ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 5820 

gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 5880 

gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 5940 

ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 6000 

actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 6060 

gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 6120 

ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg aagccagtta 6180 

ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg 6240 

gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt 6300 

tgatcttttc tactgaacgg tgatccccac cggaattgcg 6340 

<210> 55 
<211> 5344 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCMVKm2.Protopt HML-2 vector 
<400> 55 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60 

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240 

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300 

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360 

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420 

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480 

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540 

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600 

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660 

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720 

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780 

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840 

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900 

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960 

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020 

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080 

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140 

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200 

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260 

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320 

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380 

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440 

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500 

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560 

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620 

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atgtgggcca ccatcgtggg 1680 

caagcgcgcc aagggccccg ccagcggccc caccaccaac tggggcatcc ccaacagcgc 1740 

catctgcagc agcggcttca gcggcaccac cacccccacc gtgcccagcg tgagcggcaa 1800 

caagcccgtg accaccatcc agcagctgag ccccgccacc agcggcagcg ccgccgtgga 1860 

cctgtgcacc atccaggccg tgagcctgct gcccggcgag cccccccaga agacccccac 1920 

cggcgtgtac ggccccctgc ccaagggcac cgtgggcctg atcctgggcc gcagcagcct 1980 

gaacctgaag ggcgtgcaga tccacaccag cgtggtggac agcgactaca agggcgagat 2040 

ccagctggtg atcagcagca gcatcccctg gagcgccagc ccccgcgacc gcatcgccca 2100 

gctgctgctg ctgccctaca tcaagggcgg caacagcgag atcaagcgca tcggcggcct 2160 

gggcagcacc gaccccaccg gcaaggccgc ctactgggcc agccaggtga gcgagaaccg 2220 

ccccgtgtgc aaggccatca tccagggcaa gcagttcgag ggcctggtgg acaccggcgc 2280 

cgacgtgagc atcatcgccc tgaaccagtg gcccaagaac tggcccaagc agaaggccgt 2340 

gaccggcctg gtgggcatcg gcaccgccag cgaggtgtac cagagcaccg agatcctgca 2400 

ctgcctgggc cccgacaacc aggagagcac cgtgcagccc atgatcacca gcatccccct 2460 

gaacctgtgg ggccgcgacc tgctgcagca gtggggcgcc gagatcacca tgcccgcccc 2520 

cagctacagc cccaccagcc agaagatcat gaccaagatg ggctacatcc ccggcaaggg 2580 

cctgggcaag aacgaggacg gcatcaagat ccccgtggag gccaagatca accaggagcg 2640 

cgagggcatc ggcaacccct gcgcttaaag aattcagact cgagcaagtc tagaaagcca 2700 

tggatatcgg atccactacg cgttagagct cgctgatcag cctcgactgt gccttctagt 2760 

tgccagccat ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact 2820 

cccactgtcc tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat 2880 

tctattctgg ggggtggggt ggggcaggac agcaaggggg aggattggga agacaatagc 2940 

aggggggtgg gcgaagaact ccagcatgag atccccgcgc tggaggatca tccagccggc 3000 

gtcccggaaa acgattccga agcccaacct ttcatagaag gcggcggtgg aatcgaaatc 3060 

tcgtgatggc aggttgggcg tcgcttggtc ggtcatttcg aaccccagag tcccgctcag 3120 

aagaactcgt caagaaggcg atagaaggcg atgcgctgcg aatcgggagc ggcgataccg 3180 

taaagcacga ggaagcggtc agcccattcg ccgccaagct cttcagcaat atcacgggta 3240 

gccaacgcta tgtcctgata gcggtccgcc acacccagcc ggccacagtc gatgaatcca 3300 

gaaaagcggc cattttccac catgatattc ggcaagcagg catcgccatg ggtcacgacg 3360 

agatcctcgc cgtcgggcat gcgcgccttg agcctggcga acagttcggc tggcgcgagc 3420 

ccctgatgct cttcgtccag atcatcctga tcgacaagac cggcttccat ccgagtacgt 3480 

gctcgctcga tgcgatgttt cgcttggtgg tcgaatgggc aggtagccgg atcaagcgta 3540 

tgcagccgcc gcattgcatc agccatgatg gatactttct cggcaggagc aaggtgagat 3600 

gacaggagat cctgccccgg cacttcgccc aatagcagcc agtcccttcc cgcttcagtg 3660 

acaacgtcga gcacagctgc gcaaggaacg cccgtcgtgg ccagccacga tagccgcgct 3720 

gcctcgtcct gcagttcatt cagggcaccg gacaggtcgg tcttgacaaa aagaaccggg 3780 

cgcccctgcg ctgacagccg gaacacggcg gcatcagagc agccgattgt ctgttgtgcc 3840 

cagtcatagc cgaatagcct ctccacccaa gcggccggag aacctgcgtg caatccatct 3900 
tgttcaatca tgcgaaacga tcctcatcct gtctcttgat cagatcttga tcccctgcgc 3960

catcagatcc ttggcggcaa gaaagccatc cagtttactt tgcagggctt cccaacctta 4020

ccagagggcg ccccagctgg caattccggt tcgcttgctg tccataaaac cgcccagtct 4080

agctatcgcc atgtaagccc actgcaagct acctgctttc tctttgcgct tgcgttttcc 4140

cttgtccaga tagcccagta gctgacattc atccggggtc agcaccgttt ctgcggactg 4200

gctttctacg tgttccgctt cctttagcag cccttgcgcc ctgagtgctt gcggcagcgt 4260

gaagctaatt catggttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa 4320

atcggcaaaa tcccttataa atcaaaagaa tagcccgaga tagggttgag tgttgttcca 4380

gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc 4440

gtctatcagg gcgatggccg gatcagctta tgcggtgtga aataccgcac agatgcgtaa 4500

ggagaaaata ccgcatcagg cgctcttccg cttcctcgct cactgactcg ctgcgctcgg 4560

tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg ttatccacag 4620

aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc 4680

gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca 4740

aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt 4800

ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc 4860

tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc 4920

tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc 4980

ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact 5040

tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg 5100

ctacagagtt cttgaagtgg tggcctaact acggctacac tagaaggaca gtatttggta 5160

tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct tgatccggca 5220

aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa 5280

aaaaaggatc tcaagaagat cctttgatct tttctactga acggtgatcc ccaccggaat 5340

tgcg 5344

<210> 56 
<211> 7211 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCMVKm2.Polopt HML-2 vector 
<400> 56 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atgaacaaga gccgcaagcg 1680

ccgcaaccgc gagagcctgc tgggcgccgc caccgtggag ccccccaagc ccatccccct 1740

gacctggaag accgagaagc ccgtgtgggt gaaccagtgg cccctgccca agcagaagct 1800 

ggaggccctg cacctgctgg ccaacgagca gctggagaag ggccacatcg agcccagctt 1860 

cagcccctgg aacagccccg tgttcgtgat ccagaagaag agcggcaagt ggcgcatgct 1920 

gaccgacctg cgcgccgtga acgccgtgat ccagcccatg ggccccctgc agcccggcct 1980 

gcccagcccc gccatgatcc ccaaggactg gcccctgatc atcatcgacc tgaaggactg 2040 

cttcttcacc atccccctgg ccgagcagga ctgcgagaag ttcgccttca ccatccccgc 2100 

catcaacaac aaggagcccg ccacccgctt ccagtggaag gtgctgcccc agggcatgct 2160 

gaacagcccc accatctgcc agaccttcgt gggccgcgcc ctgcagcccg tgcgcgagaa 2220 

gttcagcgac tgctacatca tccactgcat cgacgacatc ctgtgcgccg ccgagaccaa 2280 

ggacaagctg atcgactgct acaccttcct gcaggccgag gtggccaacg ccggcctggc 2340 

catcgccagc gacaagatcc agaccagcac ccccttccac tacctgggca tgcagatcga 2400 

gaaccgcaag atcaagcccc agaagatcga gatccgcaag gacaccctga agaccctgaa 2460 

cgacttccag aagctgctgg gcgacatcaa ctggatccgc cccaccctgg gcatccccac 2520 

ctacgccatg agcaacctgt tcagcatcct gcgcggcgac agcgacctga acagcaagcg 2580 

catgctgacc cccgaggcca ccaaggagat caagctggtg gaggagaaga tccagagcgc 2640 

ccagatcaac cgcatcgacc ccctggcccc cctgcagctg ctgatcttcg ccaccgccca 2700 

cagccccacc ggcatcatca tccagaacac cgacctggtg gagtggagct tcctgcccca 2760 

cagcaccgtg aagaccttca ccctgtacct ggaccagatc gccaccctga tcggccagac 2820 

ccgcctgcgc atcatcaagc tgtgcggcaa cgaccccgac aagatcgtgg tgcccctgac 2880 

caaggagcag gtgcgccagg ccttcatcaa cagcggcgcc tggaagatcg gcctggccaa 2940 

cttcgtgggc atcatcgaca accactaccc caagaccaag atcttccagt tcctgaagct 3000 

gaccacctgg atcctgccca agatcacccg ccgcgagccc ctggagaacg ccctgaccgt 3060 

gttcaccgac ggcagcagca acggcaaggc cgcctacacc ggccccaagg agcgcgtgat 3120 

caagaccccc taccagagcg cccagcgcgc cgagctggtg gccgtgatca ccgtgctgca 3180 

ggacttcgac cagcccatca acatcatcag cgacagcgcc tacgtggtgc aggccacccg 3240 

cgacgtggag accgccctga tcaagtacag catggacgac cagctgaacc agctgttcaa 3300 

cctgctgcag cagaccgtgc gcaagcgcaa cttccccttc tacatcaccc acatccgcgc 3360 

ccacaccaac ctgcccggcc ccctgaccaa ggccaacgag caggccgacc tgctggtgag 3420 

cagcgccctg atcaaggccc aggagctgca cgccctgacc cacgtgaacg ccgccggcct 3480 

gaagaacaag ttcgacgtga cctggaagca ggccaaggac atcgtgcagc actgcaccca 3540 

gtgccaggtg ctgcacctgc ccacccagga ggccggcgtg aacccccgcg gcctgtgccc 3600 

caacgccctg tggcagatgg acgtgaccca cgtgcccagc ttcggccgcc tgagctacgt 3660 

gcacgtgacc gtggacacct acagccactt catctgggcc acctgccaga ccggcgagag 3720 

caccagccac gtgaagaagc acctgctgag ctgcttcgcc gtgatgggcg tgcccgagaa 3780 

gatcaagacc gacaacggcc ccggctactg cagcaaggcc ttccagaagt tcctgagcca 3840 

gtggaagatc agccacacca ccggcatccc ctacaacagc cagggccagg ccatcgtgga 3900 

gcgcaccaac cgcaccctga agacccagct ggtgaagcag aaggagggcg gcgacagcaa 3960 

ggagtgcacc accccccaga tgcagctgaa cctggccctg tacaccctga acttcctgaa 4020 

catctaccgc aaccagacca ccaccagcgc cgagcagcac ctgaccggca agaagaacag 4080 

cccccacgag ggcaagctga tctggtggaa ggacaacaag aacaagacct gggagatcgg 4140 

caaggtgatc acctggggcc gcggcttcgc ctgcgtgagc cccggcgaga accagctgcc 4200 

cgtgtggatc cccacccgcc acctgaagtt ctacaacgag cccatccgcg acgccaagaa 4260 

gagcaccagc gccgagaccg agaccagcca gagcagcacc gtggacagcc aggacgagca 4320 

gaacggcgac gtgcgccgca ccgacgaggt ggccatccac caggagggcc gcgccgccaa 4380 

cctgggcacc accaaggagg ccgacgccgt gagctacaag atcagccgcg agcacaaggg 4440 

cgacaccaac ccccgcgagt acgccgcctg cagcctggac gactgcatca acggcggcaa 4500 

gagcccctac gcctgccgca gcagctgcag cttaaagaat tcagactcga gcaagtctag 4560 

aaagccatgg atatcggatc cactacgcgt tagagctcgc tgatcagcct cgactgtgcc 4620 

ttctagttgc cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg 4680 

tgccactccc actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag 4740 

gtgtcattct attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga 4800 

caatagcagg ggggtgggcg aagaactcca gcatgagatc cccgcgctgg aggatcatcc 4860 

agccggcgtc ccggaaaacg attccgaagc ccaacctttc atagaaggcg gcggtggaat 4920 

cgaaatctcg tgatggcagg ttgggcgtcg cttggtcggt catttcgaac cccagagtcc 4980 

cgctcagaag aactcgtcaa gaaggcgata gaaggcgatg cgctgcgaat cgggagcggc 5040 

gataccgtaa agcacgagga agcggtcagc ccattcgccg ccaagctctt cagcaatatc 5100 

acgggtagcc aacgctatgt cctgatagcg gtccgccaca cccagccggc cacagtcgat 5160 

gaatccagaa aagcggccat tttccaccat gatattcggc aagcaggcat cgccatgggt 5220 

cacgacgaga tcctcgccgt cgggcatgcg cgccttgagc ctggcgaaca gttcggctgg 5280 

cgcgagcccc tgatgctctt cgtccagatc atcctgatcg acaagaccgg cttccatccg 5340 

agtacgtgct cgctcgatgc gatgtttcgc ttggtggtcg aatgggcagg tagccggatc 5400 

aagcgtatgc agccgccgca ttgcatcagc catgatggat actttctcgg caggagcaag 5460 
gtgagatgac aggagatcct gccccggcac ttcgcccaat agcagccagt cccttcccgc 5520

ttcagtgaca acgtcgagca cagctgcgca aggaacgccc gtcgtggcca gccacgatag 5580 

ccgcgctgcc tcgtcctgca gttcattcag ggcaccggac aggtcggtct tgacaaaaag 5640 

aaccgggcgc ccctgcgctg acagccggaa cacggcggca tcagagcagc cgattgtctg 5700 

ttgtgcccag tcatagccga atagcctctc cacccaagcg gccggagaac ctgcgtgcaa 5760 

tccatcttgt tcaatcatgc gaaacgatcc tcatcctgtc tcttgatcag atcttgatcc 5820 

cctgcgccat cagatccttg gcggcaagaa agccatccag tttactttgc agggcttccc 5880 

aaccttacca gagggcgccc cagctggcaa ttccggttcg cttgctgtcc ataaaaccgc 5940 

ccagtctagc tatcgccatg taagcccact gcaagctacc tgctttctct ttgcgcttgc 6000 

gttttccctt gtccagatag cccagtagct gacattcatc cggggtcagc accgtttctg 6060 

cggactggct ttctacgtgt tccgcttcct ttagcagccc ttgcgccctg agtgcttgcg 6120 

gcagcgtgaa gctaattcat ggttaaattt ttgttaaatc agctcatttt ttaaccaata 6180 

ggccgaaatc ggcaaaatcc cttataaatc aaaagaatag cccgagatag ggttgagtgt 6240 

tgttccagtt tggaacaaga gtccactatt aaagaacgtg gactccaacg tcaaagggcg 6300 

aaaaaccgtc tatcagggcg atggccggat cagcttatgc ggtgtgaaat accgcacaga 6360 

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 6420 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 6480 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 6540 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 6600 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 6660 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 6720 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 6780 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 6840 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 6900 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 6960 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 7020 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 7080 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 7140 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctactgaacg gtgatcccca 7200 

ccggaattgc g 7211 

<210> 57 
<211> 318 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 57 

atgaacccat cagagatgca aagaaaagca cctccgcgga gacggagaca tcgcaatcga 60 

gcaccgttga ctcacaagat gaacaaaatg gtgacgtcag aagaacagat gaagttgcca 120 

tccaccaaga aggcagagcc gccaacttgg gcacaactaa agaagctgac gcagttagct 180 

acaaaatatc tagagaacac aaaggtgaca caaaccccag agagtatgct gcttgcagcc 240 

ttgatgattg tatcaatggt gtctgcaggt gtacccaaca gctccgaaga gacagcgacc 300 

atcgagaacg ggccatga 318 

<210> 58 
<211> 321 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified cORF sequence 
<400> 58 

atgaacccca gcgagatgca gcgcaaggcc cccccccgcc gccgccgcca ccgcaaccgc 60 

gcccccctga cccacaagat gaacaagatg gtgaccagcg aggagcagat gaagctgccc 120 

agcaccaaga aggccgagcc ccccacctgg gcccagctga agaagctgac ccagctggcc 180 

accaagtacc tggagaacac caaggtgacc cagacccccg agagcatgct gctggccgcc 240 

ctgatgatcg tgagcatggt gagcgccggc gtgcccaaca gcagcgagga gaccgccacc 300 

atcgagaacg gccccgctta a 321 

<210> 59 
<211> 435 
<212> DNA 
<213> Human endogenous retrovirus, K family (HERV-K)

<400> 59 

atgaacccat cggagatgca aagaaaagca cctccgcgga gacggagaca tcgcaatcga 60 

gcaccgttga ctcacaagat gaacaaaatg gtgacgtcag aagaacagat gaagttgcca 120 

tccaccaaga aggcagagcc gccaacttgg gcacaactaa agaagctgac gcagttagct 180 

acaaaatatc tagagaacac aaaggtgaca caaaccccag agagtatgct gcttgcagcc 240 

ttgatgattg tatcaatggt ggtgtaccca acagctccga agagacagcg accatcgaga 300 

acgggccatg atgacgatgg cggttttgtc gaaaagaaaa gggggaaatg tggggaaaag 360 

caagagagat cagattgtta ctgtgtctgt gtagaaagaa gtagacatag gagactccat 420 

tttgttctgt actaa 435 

<210> 60 
<211> 438 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified PCAP5 sequence 
<400> 60 

atgaacccca gcgagatgca gcgcaaggcc cccccccgcc gccgccgcca ccgcaaccgc 60 

gcccccctga cccacaagat gaacaagatg gtgaccagcg aggagcagat gaagctgccc 120 

agcaccaaga aggccgagcc ccccacctgg gcccagctga agaagctgac ccagctggcc 180 

accaagtacc tggagaacac caaggtgacc cagacccccg agagcatgct gctggccgcc 240 

ctgatgatcg tgagcatggt ggtgtacccc accgccccca agcgccagcg ccccagccgc 300 

accggccacg acgacgacgg cggcttcgtg gagaagaagc gcggcaagtg cggcgagaag 360 

caggagcgca gcgactgcta ctgcgtgtgc gtggagcgca gccgccaccg ccgcctgcac 420 

ttcgtgctgt acgcttaa 438 

<210> 61 
<211> 2001 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 61 

atggggcaaa ctaaaagtaa aattaaaagt aaatatgcct cttatctcag ctttattaaa 60 

attcttttaa aaagaggggg agttaaagta tctacaaaaa atctaatcaa gctatttcaa 120 

ataatagaac aattttgccc atggtttcca gaacaaggaa ctttagatct aaaagattgg 180 

aaaagaattg gtaaggaact aaaacaagca ggtaggaagg gtaatatcat tccacttaca 240 

gtatggaatg attgggccat tattaaagca gctttagaac catttcaaac agaagaagat 300 

agcgtttcag tttctgatgc ccctggaagc tgtataatag attgtaatga aaacacaagg 360 

aaaaaatccc agaaagaaac ggaaggttta cattgcgaat atgtagcaga gccggtaatg 420 

gctcagtcaa cgcaaaatgt tgactataat caattacagg aggtgatata tcctgaaacg 480 

ttaaaattag aaggaaaagg tccagaatta gtggggccat cagagtctaa accacgaggc 540 

acaagtcctc ttccagcagg tcaggtgcct gtaacattac aacctcaaaa gcaggttaaa 600 

gaaaataaga cccaaccgcc agtagcctat caatactggc ctccggctga acttcagtat 660 

cggccacccc cagaaagtca gtatggatat ccaggaatgc ccccagcacc acagggcagg 720 

gcgccatacc ctcagccgcc cactaggaga cttaatccta cggcaccacc tagtagacag 780 

ggtagtaaat tacatgaaat tattgataaa tcaagaaagg aaggagatac tgaggcatgg 840 

caattcccag taacgttaga accgatgcca cctggagaag gagcccaaga gggagagcct 900 

cccacagttg aggccagata caagtctttt tcgataaaaa agctaaaaga tatgaaagag 960 

ggagtaaaac agtatggacc caactcccct tatatgagga cattattaga ttccattgct 1020 

catggacata gactcattcc ttatgattgg gagattctgg caaaatcgtc tctctcaccc 1080 

tctcaatttt tacaatttaa gacttggtgg attgatgggg tacaagaaca ggtccgaaga 1140 

aatagggctg ccaatcctcc agttaacata gatgcagatc aactattagg aataggtcaa 1200 

aattggagta ctattagtca acaagcatta atgcaaaatg aggccattga gcaagttaga 1260 

gctatctgcc ttagagcctg ggaaaaaatc caagacccag gaagtacctg cccctcattt 1320 

aatacagtaa gacaaggttc aaaagagccc tatcctgatt ttgtggcaag gctccaagat 1380 

gttgctcaaa agtcaattgc tgatgaaaaa gcccgtaagg tcatagtgga gttgatggca 1440 

tatgaaaacg ccaatcctga gtgtcaatca gccattaagc cattaaaagg aaaggttcct 1500 

gcaggatcag atgtaatctc agaatatgta aaagcctgtg atggaatcgg aggagctatg 1560 

cataaagcta tgcttatggc tcaagcaata acaggagttg ttttaggagg acaagttaga 1620 
acatttggaa gaaaatgtta taattgtggt caaattggtc acttaaaaaa gaattgccca 1680

gtcttaaata aacagaatat aactattcaa gcaactacaa caggtagaga gccacctgac 1740

ttatgtccaa gatgtaaaaa aggaaaacat tgggctagtc aatgtcgttc taaatttgat 1800

aaaaatgggc aaccattgtc gggaaacgag caaaggggcc agcctcaggc cccacaacaa 1860

actggggcat tcccaattca gccatttgtt cctcagggtt ttcagggaca acaaccccca 1920

ctgtcccaag tgtttcaggg aataagccag ttaccacaat acaacaattg tcccccgcca 1980

caagcggcag tgcagcagta g 2001

<210> 62 
<211> 2004 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified gag sequence 
<400> 62 

atgggccaga ccaagagcaa gatcaagagc aagtacgcca gctacctgag cttcatcaag 60

atcctgctga agcgcggcgg cgtgaaggtg agcaccaaga acctgatcaa gctgttccag 120

atcatcgagc agttctgccc ctggttcccc gagcagggca ccctggacct gaaggactgg 180

aagcgcatcg gcaaggagct gaagcaggcc ggccgcaagg gcaacatcat ccccctgacc 240

gtgtggaacg actgggccat catcaaggcc gccctggagc ccttccagac cgaggaggac 300

agcgtgagcg tgagcgacgc ccccggcagc tgcatcatcg actgcaacga gaacacccgc 360

aagaagagcc agaaggagac cgagggcctg cactgcgagt acgtggccga gcccgtgatg 420

gcccagagca cccagaacgt ggactacaac cagctgcagg aggtgatcta ccccgagacc 480

ctgaagctgg agggcaaggg ccccgagctg gtgggcccca gcgagagcaa gccccgcggc 540

accagccccc tgcccgccgg ccaggtgccc gtgaccctgc agccccagaa gcaggtgaag 600

gagaacaaga cccagccccc cgtggcctac cagtactggc cccccgccga gctgcagtac 660

cgcccccccc ccgagagcca gtacggctac cccggcatgc cccccgcccc ccagggccgc 720

gccccctacc cccagccccc cacccgccgc ctgaacccca ccgccccccc cagccgccag 780

ggcagcaagc tgcacgagat catcgacaag agccgcaagg agggcgacac cgaggcctgg 840

cagttccccg tgaccctgga gcccatgccc cccggcgagg gcgcccagga gggcgagccc 900

cccaccgtgg aggcccgcta caagagcttc agcatcaaga agctgaagga catgaaggag 960

ggcgtgaagc agtacggccc caacagcccc tacatgcgca ccctgctgga cagcatcgcc 1020

cacggccacc gcctgatccc ctacgactgg gagatcctgg ccaagagcag cctgagcccc 1080

agccagttcc tgcagttcaa gacctggtgg atcgacggcg tgcaggagca ggtgcgccgc 1140

aaccgcgccg ccaacccccc cgtgaacatc gacgccgacc agctgctggg catcggccag 1200

aactggagca ccatcagcca gcaggccctg atgcagaacg aggccatcga gcaggtgcgc 1260

gccatctgcc tgcgcgcctg ggagaagatc caggaccccg gcagcacctg ccccagcttc 1320

aacaccgtgc gccagggcag caaggagccc taccccgact tcgtggcccg cctgcaggac 1380

gtggcccaga agagcatcgc cgacgagaag gcccgcaagg tgatcgtgga gctgatggcc 1440

tacgagaacg ccaaccccga gtgccagagc gccatcaagc ccctgaaggg caaggtgccc 1500

gccggcagcg acgtgatcag cgagtacgtg aaggcctgcg acggcatcgg cggcgccatg 1560

cacaaggcca tgctgatggc ccaggccatc accggcgtgg tgctgggcgg ccaggtgcgc 1620

accttcggcc gcaagtgcta caactgcggc cagatcggcc acctgaagaa gaactgcccc 1680

gtgctgaaca agcagaacat caccatccag gccaccacca ccggccgcga gccccccgac 1740

ctgtgccccc gctgcaagaa gggcaagcac tgggccagcc agtgccgcag caagttcgac 1800

aagaacggcc agcccctgag cggcaacgag cagcgcggcc agccccaggc cccccagcag 1860

accggcgcct tccccatcca gcccttcgtg ccccagggct tccagggcca gcagcccccc 1920

ctgagccagg tgttccaggg catcagccag ctgccccagt acaacaactg cccccccccc 1980

caggccgccg tgcagcaggc ttaa 2004

<210> 63 
<211> 1005 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 63 

atgtgggcaa ccattgtcgg gaaacgagca aaggggccag cctcaggccc cacaacaaac 60

tggggcattc ccaattcagc catttgttcc tcagggtttt cagggacaac aacccccact 120

gtcccaagtg tttcagggaa taagccagtt accacaatac aacaattgtc ccccgccaca 180

agcggcagtg cagcagtaga tttatgtact atacaagcag tctctctgct tccaggggag 240

cccccacaaa aaacccccac aggggtatat ggacccctgc ctaaggggac tgtaggacta 300

atcttgggac gatcaagtct aaatctaaaa ggagttcaaa ttcatactag tgtggttgat 360

tcagactata aaggcgaaat tcaattggtt attagctctt caattccttg gagtgccagt 420 

ccaagagaca ggattgctca attattactc ctgccataca ttaagggtgg aaatagtgaa 480 

ataaaaagaa taggagggct tggaagcact gatccaacag gaaaggctgc atattgggca 540 

agtcaggtct cagagaacag acctgtgtgt aaggccatta ttcaaggaaa acagtttgaa 600 

gggttggtag acactggagc agatgtctct atcattgctt taaatcagtg gccaaaaaat 660 

tggcctaaac aaaaggctgt tacaggactt gtcggcatag gcacagcctc agaagtgtat 720 

caaagtacgg agattttaca ttgcttaggg ccagataatc aagaaagtac tgttcagcca 780 

atgattactt caattcctct taatctgtgg ggtcgagatt tattacaaca atggggtgcg 840 

gaaatcacca tgcccgctcc atcatatagc cccacgagtc aaaaaatcat gaccaagatg 900 

ggatatatac caggaaaggg actagggaaa aatgaagatg gcattaaaat tccagttgag 960 

gctaaaataa atcaagaaag agaaggaata gggaatcctt gctag 1005

<210> 64 
<211> 1008 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified Prt sequence 
<400> 64 

atgtgggcca ccatcgtggg caagcgcgcc aagggccccg ccagcggccc caccaccaac 60

tggggcatcc ccaacagcgc catctgcagc agcggcttca gcggcaccac cacccccacc 120

gtgcccagcg tgagcggcaa caagcccgtg accaccatcc agcagctgag ccccgccacc 180

agcggcagcg ccgccgtgga cctgtgcacc atccaggccg tgagcctgct gcccggcgag 240

cccccccaga agacccccac cggcgtgtac ggccccctgc ccaagggcac cgtgggcctg 300

atcctgggcc gcagcagcct gaacctgaag ggcgtgcaga tccacaccag cgtggtggac 360

agcgactaca agggcgagat ccagctggtg atcagcagca gcatcccctg gagcgccagc 420

ccccgcgacc gcatcgccca gctgctgctg ctgccctaca tcaagggcgg caacagcgag 480

atcaagcgca tcggcggcct gggcagcacc gaccccaccg gcaaggccgc ctactgggcc 540

agccaggtga gcgagaaccg ccccgtgtgc aaggccatca tccagggcaa gcagttcgag 600

ggcctggtgg acaccggcgc cgacgtgagc atcatcgccc tgaaccagtg gcccaagaac 660

tggcccaagc agaaggccgt gaccggcctg gtgggcatcg gcaccgccag cgaggtgtac 720

cagagcaccg agatcctgca ctgcctgggc cccgacaacc aggagagcac cgtgcagccc 780

atgatcacca gcatccccct gaacctgtgg ggccgcgacc tgctgcagca gtggggcgcc 840

gagatcacca tgcccgcccc cagctacagc cccaccagcc agaagatcat gaccaagatg 900

ggctacatcc ccggcaaggg cctgggcaag aacgaggacg gcatcaagat ccccgtggag 960

gccaagatca accaggagcg cgagggcatc ggcaacccct gcgcttaa 1008

<210> 65 
<211> 2874 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 65 

atgaataaat caagaaagag aaggaatagg gaatccttgc taggggcggc cactgtagag 60

cctcctaaac ccataccatt aacttggaaa acagaaaaac cagtgtgggt aaatcagtgg 120

ccgctaccaa aacaaaaact ggaggcttta catttattag caaatgaaca gttagaaaag 180

ggtcatattg agccttcgtt ctcaccttgg aattctcctg tgtttgtaat tcagaagaaa 240

tcaggcaaat ggcgtatgtt aactgactta agggctgtaa acgccgtaat tcaacccatg 300

gggcctctcc aacccgggtt gccctctccg gccatgatcc caaaagattg gcctttaatt 360

ataattgatc taaaggattg cttttttacc atccctctgg cagagcagga ttgcgaaaaa 420

tttgccttta ctataccagc cataaataat aaagaaccag ccaccaggtt tcagtggaaa 480

gtgttacctc agggaatgct taatagtcca actatttgtc agacttttgt aggtcgagct 540

cttcaaccag ttagagaaaa gttttcagac tgttatatta ttcattgtat tgatgatatt 600

ttatgtgctg cagaaacgaa agataaatta attgactgtt atacatttct gcaagcagag 660

gttgccaatg ctggactggc aatagcatct gataagatcc aaacctctac tccttttcat 720

tatttaggga tgcagataga aaatagaaaa attaagccac aaaaaataga aataagaaaa 780

gacacattaa aaacactaaa tgattttcaa aaattactag gagatattaa ttggattcgg 840

ccaactctag gcattcctac ttatgccatg tcaaatttgt tctctatctt aagaggagac 900

tcagacttaa atagtaaaag aatgttaacc ccagaggcaa caaaagaaat taaattagtg 960

gaagaaaaaa ttcagtcagc gcaaataaat agaatagatc ccttagcccc actccaactt 1020

ttgatttttg ccactgcaca ttctccaaca ggcatcatta ttcaaaatac tgatcttgtg 1080

gagtggtcat tccttcctca cagtacagtt aagactttta cattgtactt ggatcaaata 1140

gctacattaa tcggtcagac aagattacga ataataaaat tatgtgggaa tgacccagac 1200

aaaatagttg tccctttaac caaggaacaa gttagacaag cctttatcaa ttctggtgca 1260

tggaagattg gtcttgctaa ttttgtggga attattgata atcattaccc aaaaacaaag 1320

atcttccagt tcttaaaatt gactacttgg attctaccta aaattaccag acgtgaacct 1380

ttagaaaatg ctctaacagt atttactgat ggttccagca atggaaaagc agcttacaca 1440

ggaccgaaag aacgagtaat caaaactcca tatcaatcgg ctcaaagagc agagttggtt 1500

gcagtcatta cagtgttaca agattttgac caacctatca atattatatc agattctgca 1560

tatgtagtac aggctacaag ggatgttgag acagctctaa ttaaatatag catggatgat 1620

cagttaaacc agctattcaa tttattacaa caaactgtaa gaaaaagaaa tttcccattt 1680

tatattacac atattcgagc acacactaat ttaccagggc ctttgactaa agcaaatgaa 1740

caagctgact tactggtatc atctgcactc ataaaagcac aagaacttca tgctttgact 1800

catgtaaatg cagcaggatt aaaaaacaaa tttgatgtca catggaaaca ggcaaaagat 1860

attgtacaac attgcaccca gtgtcaagtc ttacacctgc ccactcaaga ggcaggagtt 1920

aatcccagag gtctgtgtcc taatgcatta tggcaaatgg atgtcacgca tgtaccttca 1980

tttggaagat tatcatatgt tcacgtaaca gttgatactt attcacattt catatgggca 2040

acttgccaaa caggagaaag tacttcccat gttaaaaaac atttattgtc ttgttttgct 2100

gtaatgggag ttccagaaaa aatcaaaact gacaatggac caggatattg tagtaaagct 2160

ttccaaaaat tcttaagtca gtggaaaatt tcacatacaa caggaattcc ttataattcc 2220

caaggacagg ccatagttga aagaactaat agaacactca aaactcaatt agttaaacaa 2280

aaagaagggg gagacagtaa ggagtgtacc actcctcaga tgcaacttaa tctagcactc 2340

tatactttaa attttttaaa catttataga aatcagacta ctacttctgc agaacaacat 2400

cttactggta aaaagaacag cccacatgaa ggaaaactaa tttggtggaa agataataaa 2460

aataagacat gggaaatagg gaaggtgata acgtggggga gaggttttgc ttgtgtttca 2520

ccaggagaaa atcagcttcc tgtttggata cccactagac atttgaagtt ctacaatgaa 2580

cccatcagag atgcaaagaa aagcacctcc gcggagacgg agacatcgca atcgagcacc 2640

gttgactcac aagatgaaca aaatggtgac gtcagaagaa cagatgaagt tgccatccac 2700

caagaaggca gagccgccaa cttgggcaca actaaagaag ctgacgcagt tagctacaaa 2760

atatctagag aacacaaagg tgacacaaac cccagagagt atgctgcttg cagccttgat 2820

gattgtatca atggtggtaa gtctccctat gcctgcagga gcagctgcag ctaa 2874

<210> 66 
<211> 2877 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified pol sequence 
<400> 66 

atgaacaaga gccgcaagcg ccgcaaccgc gagagcctgc tgggcgccgc caccgtggag 60

ccccccaagc ccatccccct gacctggaag accgagaagc ccgtgtgggt gaaccagtgg 120

cccctgccca agcagaagct ggaggccctg cacctgctgg ccaacgagca gctggagaag 180

ggccacatcg agcccagctt cagcccctgg aacagccccg tgttcgtgat ccagaagaag 240

agcggcaagt ggcgcatgct gaccgacctg cgcgccgtga acgccgtgat ccagcccatg 300

ggccccctgc agcccggcct gcccagcccc gccatgatcc ccaaggactg gcccctgatc 360

atcatcgacc tgaaggactg cttcttcacc atccccctgg ccgagcagga ctgcgagaag 420

ttcgccttca ccatccccgc catcaacaac aaggagcccg ccacccgctt ccagtggaag 480

gtgctgcccc agggcatgct gaacagcccc accatctgcc agaccttcgt gggccgcgcc 540

ctgcagcccg tgcgcgagaa gttcagcgac tgctacatca tccactgcat cgacgacatc 600

ctgtgcgccg ccgagaccaa ggacaagctg atcgactgct acaccttcct gcaggccgag 660

gtggccaacg ccggcctggc catcgccagc gacaagatcc agaccagcac ccccttccac 720

tacctgggca tgcagatcga gaaccgcaag atcaagcccc agaagatcga gatccgcaag 780

gacaccctga agaccctgaa cgacttccag aagctgctgg gcgacatcaa ctggatccgc 840

cccaccctgg gcatccccac ctacgccatg agcaacctgt tcagcatcct gcgcggcgac 900

agcgacctga acagcaagcg catgctgacc cccgaggcca ccaaggagat caagctggtg 960

gaggagaaga tccagagcgc ccagatcaac cgcatcgacc ccctggcccc cctgcagctg 1020

ctgatcttcg ccaccgccca cagccccacc ggcatcatca tccagaacac cgacctggtg 1080

gagtggagct tcctgcccca cagcaccgtg aagaccttca ccctgtacct ggaccagatc 1140

gccaccctga tcggccagac ccgcctgcgc atcatcaagc tgtgcggcaa cgaccccgac 1200

aagatcgtgg tgcccctgac caaggagcag gtgcgccagg ccttcatcaa cagcggcgcc 1260

tggaagatcg gcctggccaa cttcgtgggc atcatcgaca accactaccc caagaccaag 1320

atcttccagt tcctgaagct gaccacctgg atcctgccca agatcacccg ccgcgagccc 1380

ctggagaacg ccctgaccgt gttcaccgac ggcagcagca acggcaaggc cgcctacacc 1440 

ggccccaagg agcgcgtgat caagaccccc taccagagcg cccagcgcgc cgagctggtg 1500 

gccgtgatca ccgtgctgca ggacttcgac cagcccatca acatcatcag cgacagcgcc 1560 

tacgtggtgc aggccacccg cgacgtggag accgccctga tcaagtacag catggacgac 1620 

cagctgaacc agctgttcaa cctgctgcag cagaccgtgc gcaagcgcaa cttccccttc 1680 

tacatcaccc acatccgcgc ccacaccaac ctgcccggcc ccctgaccaa ggccaacgag 1740 

caggccgacc tgctggtgag cagcgccctg atcaaggccc aggagctgca cgccctgacc 1800 

cacgtgaacg ccgccggcct gaagaacaag ttcgacgtga cctggaagca ggccaaggac 1860 

atcgtgcagc actgcaccca gtgccaggtg ctgcacctgc ccacccagga ggccggcgtg 1920 

aacccccgcg gcctgtgccc caacgccctg tggcagatgg acgtgaccca cgtgcccagc 1980 

ttcggccgcc tgagctacgt gcacgtgacc gtggacacct acagccactt catctgggcc 2040 

acctgccaga ccggcgagag caccagccac gtgaagaagc acctgctgag ctgcttcgcc 2100 

gtgatgggcg tgcccgagaa gatcaagacc gacaacggcc ccggctactg cagcaaggcc 2160 

ttccagaagt tcctgagcca gtggaagatc agccacacca ccggcatccc ctacaacagc 2220 

cagggccagg ccatcgtgga gcgcaccaac cgcaccctga agacccagct ggtgaagcag 2280 

aaggagggcg gcgacagcaa ggagtgcacc accccccaga tgcagctgaa cctggccctg 2340 

tacaccctga acttcctgaa catctaccgc aaccagacca ccaccagcgc cgagcagcac 2400 

ctgaccggca agaagaacag cccccacgag ggcaagctga tctggtggaa ggacaacaag 2460 

aacaagacct gggagatcgg caaggtgatc acctggggcc gcggcttcgc ctgcgtgagc 2520 

cccggcgaga accagctgcc cgtgtggatc cccacccgcc acctgaagtt ctacaacgag 2580 

cccatccgcg acgccaagaa gagcaccagc gccgagaccg agaccagcca gagcagcacc 2640 

gtggacagcc aggacgagca gaacggcgac gtgcgccgca ccgacgaggt ggccatccac 2700 

caggagggcc gcgccgccaa cctgggcacc accaaggagg ccgacgccgt gagctacaag 2760 

atcagccgcg agcacaaggg cgacaccaac ccccgcgagt acgccgcctg cagcctggac 2820 

gactgcatca acggcggcaa gagcccctac gcctgccgca gcagctgcag cgcttaa 2877 

<210> 67 
<211> 106 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Manipulated cORF 
<400> 67 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg 
1 5 10 15 

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr 
20 25 30 

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro 
35 40 45 

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu 
50 55 60 

Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala 
65 70 75 80 

Leu Met Ile Val Ser Met Val Ser Ala Gly Val Pro Asn Ser Ser Glu 
85 90 95 

Glu Thr Ala Thr Ile Glu Asn Gly Pro Ala 
100 105 

<210> 68 
<211> 145 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Manipulated PCAP5

<400> 68 

Met Asn Pro Ser Glu Met Gln Arg Lys Ala Pro Pro Arg Arg Arg Arg
1 5 10 15

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr
20 25 30

Ser Glu Glu Gln Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro
35 40 45

Thr Trp Ala Gln Leu Lys Lys Leu Thr Gln Leu Ala Thr Lys Tyr Leu
50 55 60

Glu Asn Thr Lys Val Thr Gln Thr Pro Glu Ser Met Leu Leu Ala Ala
65 70 75 80

Leu Met Ile Val Ser Met Val Val Tyr Pro Thr Ala Pro Lys Arg Gln
85 90 95

Arg Pro Ser Arg Thr Gly His Asp Asp Asp Gly Gly Phe Val Glu Lys
100 105 110

Lys Arg Gly Lys Cys Gly Glu Lys Gln Glu Arg Ser Asp Cys Tyr Cys
115 120 125

Val Cys Val Glu Arg Ser Arg His Arg Arg Leu His Phe Val Leu Tyr
130 135 140

Ala
145

<210> 69 
<211> 666 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 69 

Met Gly Gln Thr Lys Ser Lys Ile Lys Ser Lys Tyr Ala Ser Tyr Leu
1 5 10 15

Ser Phe Ile Lys Ile Leu Leu Lys Arg Gly Gly Val Lys Val Ser Thr
20 25 30

Lys Asn Leu Ile Lys Leu Phe Gln Ile Ile Glu Gln Phe Cys Pro Trp
35 40 45

Phe Pro Glu Gln Gly Thr Leu Asp Leu Lys Asp Trp Lys Arg Ile Gly
50 55 60

Lys Glu Leu Lys Gln Ala Gly Arg Lys Gly Asn Ile Ile Pro Leu Thr
65 70 75 80

Val Trp Asn Asp Trp Ala Ile Ile Lys Ala Ala Leu Glu Pro Phe Gln
85 90 95

Thr Glu Glu Asp Ser Val Ser Val Ser Asp Ala Pro Gly Ser Cys Ile
100 105 110

Ile Asp Cys Asn Glu Asn Thr Arg Lys Lys Ser Gln Lys Glu Thr Glu
115 120 125

Gly Leu His Cys Glu Tyr Val Ala Glu Pro Val Met Ala Gln Ser Thr
130 135 140

Gln Asn Val Asp Tyr Asn Gln Leu Gln Glu Val Ile Tyr Pro Glu Thr
145 150 155 160

Leu Lys Leu Glu Gly Lys Gly Pro Glu Leu Val Gly Pro Ser Glu Ser
165 170 175

Lys Pro Arg Gly Thr Ser Pro Leu Pro Ala Gly Gln Val Pro Val Thr
180 185 190

Leu Gln Pro Gln Lys Gln Val Lys Glu Asn Lys Thr Gln Pro Pro Val
195 200 205

Ala Tyr Gln Tyr Trp Pro Pro Ala Glu Leu Gln Tyr Arg Pro Pro Pro
210 215 220

Glu Ser Gln Tyr Gly Tyr Pro Gly Met Pro Pro Ala Pro Gln Gly Arg
225 230 235 240

Ala Pro Tyr Pro Gln Pro Pro Thr Arg Arg Leu Asn Pro Thr Ala Pro
245 250 255

Pro Ser Arg Gln Gly Ser Lys Leu His Glu Ile Ile Asp Lys Ser Arg
260 265 270

Lys Glu Gly Asp Thr Glu Ala Trp Gln Phe Pro Val Thr Leu Glu Pro
275 280 285

Met Pro Pro Gly Glu Gly Ala Gln Glu Gly Glu Pro Pro Thr Val Glu
290 295 300

Ala Arg Tyr Lys Ser Phe Ser Ile Lys Lys Leu Lys Asp Met Lys Glu
305 310 315 320

Gly Val Lys Gln Tyr Gly Pro Asn Ser Pro Tyr Met Arg Thr Leu Leu
325 330 335

Asp Ser Ile Ala His Gly His Arg Leu Ile Pro Tyr Asp Trp Glu Ile
340 345 350

Leu Ala Lys Ser Ser Leu Ser Pro Ser Gln Phe Leu Gln Phe Lys Thr
355 360 365

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Arg Asn Arg Ala Ala
370 375 380

Asn Pro Pro Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Ile Gly Gln
385 390 395 400

Asn Trp Ser Thr Ile Ser Gln Gln Ala Leu Met Gln Asn Glu Ala Ile
405 410 415

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Glu Lys Ile Gln Asp
420 425 430

Pro Gly Ser Thr Cys Pro Ser Phe Asn Thr Val Arg Gln Gly Ser Lys
435 440 445

Glu Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Val Ala Gln Lys
450 455 460

Ser Ile Ala Asp Glu Lys Ala Arg Lys Val Ile Val Glu Leu Met Ala
465 470 475 480

Tyr Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys
485 490 495

Gly Lys Val Pro Ala Gly Ser Asp Val Ile Ser Glu Tyr Val Lys Ala
500 505 510

Cys Asp Gly Ile Gly Gly Ala Met His Lys Ala Met Leu Met Ala Gln
515 520 525

Ala Ile Thr Gly Val Val Leu Gly Gly Gln Val Arg Thr Phe Gly Arg
530 535 540

Lys Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Lys Asn Cys Pro
545 550 555 560

Val Leu Asn Lys Gln Asn Ile Thr Ile Gln Ala Thr Thr Thr Gly Arg
565 570 575

Glu Pro Pro Asp Leu Cys Pro Arg Cys Lys Lys Gly Lys His Trp Ala
580 585 590

Ser Gln Cys Arg Ser Lys Phe Asp Lys Asn Gly Gln Pro Leu Ser Gly
595 600 605

Asn Glu Gln Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe
610 615 620

Pro Ile Gln Pro Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Pro
625 630 635 640

Leu Ser Gln Val Phe Gln Gly Ile Ser Gln Leu Pro Gln Tyr Asn Asn
645 650 655

Cys Pro Pro Pro Gln Ala Ala Val Gln Gln
660 665

<210> 70 
<211> 667 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Manipulated Gag 
<400> 70 

Met Gly Gln Thr Lys Ser Lys Ile Lys Ser Lys Tyr Ala Ser Tyr Leu
1 5 10 15

Ser Phe Ile Lys Ile Leu Leu Lys Arg Gly Gly Val Lys Val Ser Thr
20 25 30

Lys Asn Leu Ile Lys Leu Phe Gln Ile Ile Glu Gln Phe Cys Pro Trp
35 40 45

Phe Pro Glu Gln Gly Thr Leu Asp Leu Lys Asp Trp Lys Arg Ile Gly
50 55 60

Lys Glu Leu Lys Gln Ala Gly Arg Lys Gly Asn Ile Ile Pro Leu Thr
65 70 75 80

Val Trp Asn Asp Trp Ala Ile Ile Lys Ala Ala Leu Glu Pro Phe Gln
85 90 95

Thr Glu Glu Asp Ser Val Ser Val Ser Asp Ala Pro Gly Ser Cys Ile
100 105 110

Ile Asp Cys Asn Glu Asn Thr Arg Lys Lys Ser Gln Lys Glu Thr Glu
115 120 125

Gly Leu His Cys Glu Tyr Val Ala Glu Pro Val Met Ala Gln Ser Thr
130 135 140

Gln Asn Val Asp Tyr Asn Gln Leu Gln Glu Val Ile Tyr Pro Glu Thr
145 150 155 160

Leu Lys Leu Glu Gly Lys Gly Pro Glu Leu Val Gly Pro Ser Glu Ser
165 170 175

Lys Pro Arg Gly Thr Ser Pro Leu Pro Ala Gly Gln Val Pro Val Thr
180 185 190

Leu Gln Pro Gln Lys Gln Val Lys Glu Asn Lys Thr Gln Pro Pro Val
195 200 205

Ala Tyr Gln Tyr Trp Pro Pro Ala Glu Leu Gln Tyr Arg Pro Pro Pro
210 215 220

Glu Ser Gln Tyr Gly Tyr Pro Gly Met Pro Pro Ala Pro Gln Gly Arg
225 230 235 240

Ala Pro Tyr Pro Gln Pro Pro Thr Arg Arg Leu Asn Pro Thr Ala Pro
245 250 255

Pro Ser Arg Gln Gly Ser Lys Leu His Glu Ile Ile Asp Lys Ser Arg
260 265 270

Lys Glu Gly Asp Thr Glu Ala Trp Gln Phe Pro Val Thr Leu Glu Pro
275 280 285

Met Pro Pro Gly Glu Gly Ala Gln Glu Gly Glu Pro Pro Thr Val Glu
290 295 300

Ala Arg Tyr Lys Ser Phe Ser Ile Lys Lys Leu Lys Asp Met Lys Glu
305 310 315 320

Gly Val Lys Gln Tyr Gly Pro Asn Ser Pro Tyr Met Arg Thr Leu Leu
325 330 335

Asp Ser Ile Ala His Gly His Arg Leu Ile Pro Tyr Asp Trp Glu Ile
340 345 350

Leu Ala Lys Ser Ser Leu Ser Pro Ser Gln Phe Leu Gln Phe Lys Thr
355 360 365

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Arg Asn Arg Ala Ala
370 375 380

Asn Pro Pro Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Ile Gly Gln
385 390 395 400

Asn Trp Ser Thr Ile Ser Gln Gln Ala Leu Met Gln Asn Glu Ala Ile
405 410 415

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Glu Lys Ile Gln Asp
420 425 430

Pro Gly Ser Thr Cys Pro Ser Phe Asn Thr Val Arg Gln Gly Ser Lys
435 440 445

Glu Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Val Ala Gln Lys
450 455 460

Ser Ile Ala Asp Glu Lys Ala Arg Lys Val Ile Val Glu Leu Met Ala
465 470 475 480

Tyr Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys
485 490 495

Gly Lys Val Pro Ala Gly Ser Asp Val Ile Ser Glu Tyr Val Lys Ala
500 505 510

Cys Asp Gly Ile Gly Gly Ala Met His Lys Ala Met Leu Met Ala Gln
515 520 525

Ala Ile Thr Gly Val Val Leu Gly Gly Gln Val Arg Thr Phe Gly Arg
530 535 540

Lys Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Lys Asn Cys Pro
545 550 555 560

Val Leu Asn Lys Gln Asn Ile Thr Ile Gln Ala Thr Thr Thr Gly Arg
565 570 575

Glu Pro Pro Asp Leu Cys Pro Arg Cys Lys Lys Gly Lys His Trp Ala
580 585 590

Ser Gln Cys Arg Ser Lys Phe Asp Lys Asn Gly Gln Pro Leu Ser Gly
595 600 605

Asn Glu Gln Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe
610 615 620

Pro Ile Gln Pro Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Pro
625 630 635 640

Leu Ser Gln Val Phe Gln Gly Ile Ser Gln Leu Pro Gln Tyr Asn Asn
645 650 655

Cys Pro Pro Pro Gln Ala Ala Val Gln Gln Ala
660 665

<210> 71 
<211> 334 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 71 

Met Trp Ala Thr Ile Val Gly Lys Arg Ala Lys Gly Pro Ala Ser Gly
1 5 10 15

Pro Thr Thr Asn Trp Gly Ile Pro Asn Ser Ala Ile Cys Ser Ser Gly
20 25 30

Phe Ser Gly Thr Thr Thr Pro Thr Val Pro Ser Val Ser Gly Asn Lys
35 40 45

Pro Val Thr Thr Ile Gln Gln Leu Ser Pro Ala Thr Ser Gly Ser Ala
50 55 60

Ala Val Asp Leu Cys Thr Ile Gln Ala Val Ser Leu Leu Pro Gly Glu
65 70 75 80

Pro Pro Gln Lys Thr Pro Thr Gly Val Tyr Gly Pro Leu Pro Lys Gly
85 90 95

Thr Val Gly Leu Ile Leu Gly Arg Ser Ser Leu Asn Leu Lys Gly Val
100 105 110

Gln Ile His Thr Ser Val Val Asp Ser Asp Tyr Lys Gly Glu Ile Gln
115 120 125

Leu Val Ile Ser Ser Ser Ile Pro Trp Ser Ala Ser Pro Arg Asp Arg
130 135 140

Ile Ala Gln Leu Leu Leu Leu Pro Tyr Ile Lys Gly Gly Asn Ser Glu
145 150 155 160

Ile Lys Arg Ile Gly Gly Leu Gly Ser Thr Asp Pro Thr Gly Lys Ala
165 170 175

Ala Tyr Trp Ala Ser Gln Val Ser Glu Asn Arg Pro Val Cys Lys Ala
180 185 190

Ile Ile Gln Gly Lys Gln Phe Glu Gly Leu Val Asp Thr Gly Ala Asp
195 200 205

Val Ser Ile Ile Ala Leu Asn Gln Trp Pro Lys Asn Trp Pro Lys Gln
210 215 220

Lys Ala Val Thr Gly Leu Val Gly Ile Gly Thr Ala Ser Glu Val Tyr
225 230 235 240

Gln Ser Thr Glu Ile Leu His Cys Leu Gly Pro Asp Asn Gln Glu Ser
245 250 255

Thr Val Gln Pro Met Ile Thr Ser Ile Pro Leu Asn Leu Trp Gly Arg
260 265 270

Asp Leu Leu Gln Gln Trp Gly Ala Glu Ile Thr Met Pro Ala Pro Ser
275 280 285

Tyr Ser Pro Thr Ser Gln Lys Ile Met Thr Lys Met Gly Tyr Ile Pro
290 295 300

Gly Lys Gly Leu Gly Lys Asn Glu Asp Gly Ile Lys Ile Pro Val Glu
305 310 315 320

Ala Lys Ile Asn Gln Glu Arg Glu Gly Ile Gly Asn Pro Cys
325 330

<210> 72 
<211> 335 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Manipulated Prt 
<400> 72

Met Trp Ala Thr Ile Val Gly Lys Arg Ala Lys Gly Pro Ala Ser Gly
1 5 10 15

Pro Thr Thr Asn Trp Gly Ile Pro Asn Ser Ala Ile Cys Ser Ser Gly
20 25 30

Phe Ser Gly Thr Thr Thr Pro Thr Val Pro Ser Val Ser Gly Asn Lys
35 40 45

Pro Val Thr Thr Ile Gln Gln Leu Ser Pro Ala Thr Ser Gly Ser Ala
50 55 60

Ala Val Asp Leu Cys Thr Ile Gln Ala Val Ser Leu Leu Pro Gly Glu
65 70 75 80

Pro Pro Gln Lys Thr Pro Thr Gly Val Tyr Gly Pro Leu Pro Lys Gly
85 90 95

Thr Val Gly Leu Ile Leu Gly Arg Ser Ser Leu Asn Leu Lys Gly Val
100 105 110

Gln Ile His Thr Ser Val Val Asp Ser Asp Tyr Lys Gly Glu Ile Gln
115 120 125

Leu Val Ile Ser Ser Ser Ile Pro Trp Ser Ala Ser Pro Arg Asp Arg
130 135 140

Ile Ala Gln Leu Leu Leu Leu Pro Tyr Ile Lys Gly Gly Asn Ser Glu
145 150 155 160

Ile Lys Arg Ile Gly Gly Leu Gly Ser Thr Asp Pro Thr Gly Lys Ala
165 170 175

Ala Tyr Trp Ala Ser Gln Val Ser Glu Asn Arg Pro Val Cys Lys Ala
180 185 190

Ile Ile Gln Gly Lys Gln Phe Glu Gly Leu Val Asp Thr Gly Ala Asp
195 200 205

Val Ser Ile Ile Ala Leu Asn Gln Trp Pro Lys Asn Trp Pro Lys Gln
210 215 220

Lys Ala Val Thr Gly Leu Val Gly Ile Gly Thr Ala Ser Glu Val Tyr
225 230 235 240

Gln Ser Thr Glu Ile Leu His Cys Leu Gly Pro Asp Asn Gln Glu Ser
245 250 255

Thr Val Gln Pro Met Ile Thr Ser Ile Pro Leu Asn Leu Trp Gly Arg
260 265 270

Asp Leu Leu Gln Gln Trp Gly Ala Glu Ile Thr Met Pro Ala Pro Ser
275 280 285

Tyr Ser Pro Thr Ser Gln Lys Ile Met Thr Lys Met Gly Tyr Ile Pro
290 295 300

Gly Lys Gly Leu Gly Lys Asn Glu Asp Gly Ile Lys Ile Pro Val Glu
305 310 315 320

Ala Lys Ile Asn Gln Glu Arg Glu Gly Ile Gly Asn Pro Cys Ala
325 330 335

<210> 73
<211> 957 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K)
<400> 73 

Met Asn Lys Ser Arg Lys Arg Arg Asn Arg Glu Ser Leu Leu Gly Ala
1 5 10 15

Ala Thr Val Glu Pro Pro Lys Pro Ile Pro Leu Thr Trp Lys Thr Glu
20 25 30

Lys Pro Val Trp Val Asn Gln Trp Pro Leu Pro Lys Gln Lys Leu Glu
35 40 45

Ala Leu His Leu Leu Ala Asn Glu Gln Leu Glu Lys Gly His Ile Glu
50 55 60

Pro Ser Phe Ser Pro Trp Asn Ser Pro Val Phe Val Ile Gln Lys Lys
65 70 75 80

Ser Gly Lys Trp Arg Met Leu Thr Asp Leu Arg Ala Val Asn Ala Val
85 90 95

Ile Gln Pro Met Gly Pro Leu Gln Pro Gly Leu Pro Ser Pro Ala Met
100 105 110

Ile Pro Lys Asp Trp Pro Leu Ile Ile Ile Asp Leu Lys Asp Cys Phe
115 120 125

Phe Thr Ile Pro Leu Ala Glu Gln Asp Cys Glu Lys Phe Ala Phe Thr
130 135 140

Ile Pro Ala Ile Asn Asn Lys Glu Pro Ala Thr Arg Phe Gln Trp Lys
145 150 155 160

Val Leu Pro Gln Gly Met Leu Asn Ser Pro Thr Ile Cys Gln Thr Phe
165 170 175

Val Gly Arg Ala Leu Gln Pro Val Arg Glu Lys Phe Ser Asp Cys Tyr
180 185 190

Ile Ile His Cys Ile Asp Asp Ile Leu Cys Ala Ala Glu Thr Lys Asp
195 200 205

Lys Leu Ile Asp Cys Tyr Thr Phe Leu Gln Ala Glu Val Ala Asn Ala
210 215 220

Gly Leu Ala Ile Ala Ser Asp Lys Ile Gln Thr Ser Thr Pro Phe His
225 230 235 240

Tyr Leu Gly Met Gln Ile Glu Asn Arg Lys Ile Lys Pro Gln Lys Ile
245 250 255

Glu Ile Arg Lys Asp Thr Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu
260 265 270

Leu Gly Asp Ile Asn Trp Ile Arg Pro Thr Leu Gly Ile Pro Thr Tyr
275 280 285

Ala Met Ser Asn Leu Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn
290 295 300

Ser Lys Arg Met Leu Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val
305 310 315 320

Glu Glu Lys Ile Gln Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala
325 330 335

Pro Leu Gln Leu Leu Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile
340 345 350

Ile Ile Gln Asn Thr Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser
355 360 365

Thr Val Lys Thr Phe Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile
370 375 380

Gly Gln Thr Arg Leu Arg Ile Ile Lys Leu Cys Gly Asn Asp Pro Asp
385 390 395 400

Lys Ile Val Val Pro Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile
405 410 415

Asn Ser Gly Ala Trp Lys Ile Gly Leu Ala Asn Phe Val Gly Ile Ile
420 425 430

Asp Asn His Tyr Pro Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr
435 440 445

Thr Trp Ile Leu Pro Lys Ile Thr Arg Arg Glu Pro Leu Glu Asn Ala
450 455 460

Leu Thr Val Phe Thr Asp Gly Ser Ser Asn Gly Lys Ala Ala Tyr Thr
465 470 475 480

Gly Pro Lys Glu Arg Val Ile Lys Thr Pro Tyr Gln Ser Ala Gln Arg
485 490 495

Ala Glu Leu Val Ala Val Ile Thr Val Leu Gln Asp Phe Asp Gln Pro
500 505 510

Ile Asn Ile Ile Ser Asp Ser Ala Tyr Val Val Gln Ala Thr Arg Asp
515 520 525

Val Glu Thr Ala Leu Ile Lys Tyr Ser Met Asp Asp Gln Leu Asn Gln
530 535 540

Leu Phe Asn Leu Leu Gln Gln Thr Val Arg Lys Arg Asn Phe Pro Phe
545 550 555 560

Tyr Ile Thr His Ile Arg Ala His Thr Asn Leu Pro Gly Pro Leu Thr
565 570 575

Lys Ala Asn Glu Gln Ala Asp Leu Leu Val Ser Ser Ala Leu Ile Lys
580 585 590

Ala Gln Glu Leu His Ala Leu Thr His Val Asn Ala Ala Gly Leu Lys
595 600 605

Asn Lys Phe Asp Val Thr Trp Lys Gln Ala Lys Asp Ile Val Gln His
610 615 620

Cys Thr Gln Cys Gln Val Leu His Leu Pro Thr Gln Glu Ala Gly Val
625 630 635 640

Asn Pro Arg Gly Leu Cys Pro Asn Ala Leu Trp Gln Met Asp Val Thr
645 650 655

His Val Pro Ser Phe Gly Arg Leu Ser Tyr Val His Val Thr Val Asp
660 665 670

Thr Tyr Ser His Phe Ile Trp Ala Thr Cys Gln Thr Gly Glu Ser Thr
675 680 685

Ser His Val Lys Lys His Leu Leu Ser Cys Phe Ala Val Met Gly Val
690 695 700

Pro Glu Lys Ile Lys Thr Asp Asn Gly Pro Gly Tyr Cys Ser Lys Ala
705 710 715 720

Phe Gln Lys Phe Leu Ser Gln Trp Lys Ile Ser His Thr Thr Gly Ile
725 730 735

Pro Tyr Asn Ser Gln Gly Gln Ala Ile Val Glu Arg Thr Asn Arg Thr
740 745 750

Leu Lys Thr Gln Leu Val Lys Gln Lys Glu Gly Gly Asp Ser Lys Glu
755 760 765

Cys Thr Thr Pro Gln Met Gln Leu Asn Leu Ala Leu Tyr Thr Leu Asn
770 775 780

Phe Leu Asn Ile Tyr Arg Asn Gln Thr Thr Thr Ser Ala Glu Gln His
785 790 795 800

Leu Thr Gly Lys Lys Asn Ser Pro His Glu Gly Lys Leu Ile Trp Trp
805 810 815

Lys Asp Asn Lys Asn Lys Thr Trp Glu Ile Gly Lys Val Ile Thr Trp
820 825 830

Gly Arg Gly Phe Ala Cys Val Ser Pro Gly Glu Asn Gln Leu Pro Val
835 840 845

Trp Ile Pro Thr Arg His Leu Lys Phe Tyr Asn Glu Pro Ile Arg Asp
850 855 860

Ala Lys Lys Ser Thr Ser Ala Glu Thr Glu Thr Ser Gln Ser Ser Thr
865 870 875 880

Val Asp Ser Gln Asp Glu Gln Asn Gly Asp Val Arg Arg Thr Asp Glu
885 890 895

Val Ala Ile His Gln Glu Gly Arg Ala Ala Asn Leu Gly Thr Thr Lys
900 905 910

Glu Ala Asp Ala Val Ser Tyr Lys Ile Ser Arg Glu His Lys Gly Asp
915 920 925

Thr Asn Pro Arg Glu Tyr Ala Ala Cys Ser Leu Asp Asp Cys Ile Asn
930 935 940

Gly Gly Lys Ser Pro Tyr Ala Cys Arg Ser Ser Cys Ser
945 950 955

<210> 74 
<211> 958 
<212> PRT

<213> Artificial Sequence

<220> 

<223> Manipulated Pol 
<400> 74 

Met Asn Lys Ser Arg Lys Arg Arg Asn Arg Glu Ser Leu Leu Gly Ala
1 5 10 15

Ala Thr Val Glu Pro Pro Lys Pro Ile Pro Leu Thr Trp Lys Thr Glu
20 25 30

Lys Pro Val Trp Val Asn Gln Trp Pro Leu Pro Lys Gln Lys Leu Glu
35 40 45

Ala Leu His Leu Leu Ala Asn Glu Gln Leu Glu Lys Gly His Ile Glu
50 55 60

Pro Ser Phe Ser Pro Trp Asn Ser Pro Val Phe Val Ile Gln Lys Lys
65 70 75 80

Ser Gly Lys Trp Arg Met Leu Thr Asp Leu Arg Ala Val Asn Ala Val
85 90 95

Ile Gln Pro Met Gly Pro Leu Gln Pro Gly Leu Pro Ser Pro Ala Met
100 105 110

Ile Pro Lys Asp Trp Pro Leu Ile Ile Ile Asp Leu Lys Asp Cys Phe
115 120 125

Phe Thr Ile Pro Leu Ala Glu Gln Asp Cys Glu Lys Phe Ala Phe Thr
130 135 140

Ile Pro Ala Ile Asn Asn Lys Glu Pro Ala Thr Arg Phe Gln Trp Lys
145 150 155 160

Val Leu Pro Gln Gly Met Leu Asn Ser Pro Thr Ile Cys Gln Thr Phe
165 170 175

Val Gly Arg Ala Leu Gln Pro Val Arg Glu Lys Phe Ser Asp Cys Tyr
180 185 190

Ile Ile His Cys Ile Asp Asp Ile Leu Cys Ala Ala Glu Thr Lys Asp
195 200 205

Lys Leu Ile Asp Cys Tyr Thr Phe Leu Gln Ala Glu Val Ala Asn Ala
210 215 220

Gly Leu Ala Ile Ala Ser Asp Lys Ile Gln Thr Ser Thr Pro Phe His
225 230 235 240

Tyr Leu Gly Met Gln Ile Glu Asn Arg Lys Ile Lys Pro Gln Lys Ile
245 250 255

Glu Ile Arg Lys Asp Thr Leu Lys Thr Leu Asn Asp Phe Gln Lys Leu
260 265 270

Leu Gly Asp Ile Asn Trp Ile Arg Pro Thr Leu Gly Ile Pro Thr Tyr
275 280 285

Ala Met Ser Asn Leu Phe Ser Ile Leu Arg Gly Asp Ser Asp Leu Asn
290 295 300

Ser Lys Arg Met Leu Thr Pro Glu Ala Thr Lys Glu Ile Lys Leu Val
305 310 315 320

Glu Glu Lys Ile Gln Ser Ala Gln Ile Asn Arg Ile Asp Pro Leu Ala
325 330 335

Pro Leu Gln Leu Leu Ile Phe Ala Thr Ala His Ser Pro Thr Gly Ile
340 345 350

Ile Ile Gln Asn Thr Asp Leu Val Glu Trp Ser Phe Leu Pro His Ser
355 360 365

Thr Val Lys Thr Phe Thr Leu Tyr Leu Asp Gln Ile Ala Thr Leu Ile
370 375 380

Gly Gln Thr Arg Leu Arg Ile Ile Lys Leu Cys Gly Asn Asp Pro Asp
385 390 395 400

Lys Ile Val Val Pro Leu Thr Lys Glu Gln Val Arg Gln Ala Phe Ile
405 410 415

Asn Ser Gly Ala Trp Lys Ile Gly Leu Ala Asn Phe Val Gly Ile Ile
420 425 430

Asp Asn His Tyr Pro Lys Thr Lys Ile Phe Gln Phe Leu Lys Leu Thr
435 440 445

Thr Trp Ile Leu Pro Lys Ile Thr Arg Arg Glu Pro Leu Glu Asn Ala
450 455 460

Leu Thr Val Phe Thr Asp Gly Ser Ser Asn Gly Lys Ala Ala Tyr Thr
465 470 475 480

Gly Pro Lys Glu Arg Val Ile Lys Thr Pro Tyr Gln Ser Ala Gln Arg
485 490 495

Ala Glu Leu Val Ala Val Ile Thr Val Leu Gln Asp Phe Asp Gln Pro
500 505 510

Ile Asn Ile Ile Ser Asp Ser Ala Tyr Val Val Gln Ala Thr Arg Asp
515 520 525

Val Glu Thr Ala Leu Ile Lys Tyr Ser Met Asp Asp Gln Leu Asn Gln
530 535 540

Leu Phe Asn Leu Leu Gln Gln Thr Val Arg Lys Arg Asn Phe Pro Phe
545 550 555 560

Tyr Ile Thr His Ile Arg Ala His Thr Asn Leu Pro Gly Pro Leu Thr
565 570 575

Lys Ala Asn Glu Gln Ala Asp Leu Leu Val Ser Ser Ala Leu Ile Lys
580 585 590

Ala Gln Glu Leu His Ala Leu Thr His Val Asn Ala Ala Gly Leu Lys
595 600 605

Asn Lys Phe Asp Val Thr Trp Lys Gln Ala Lys Asp Ile Val Gln His
610 615 620

Cys Thr Gln Cys Gln Val Leu His Leu Pro Thr Gln Glu Ala Gly Val
625 630 635 640

Asn Pro Arg Gly Leu Cys Pro Asn Ala Leu Trp Gln Met Asp Val Thr
645 650 655

His Val Pro Ser Phe Gly Arg Leu Ser Tyr Val His Val Thr Val Asp
660 665 670

Thr Tyr Ser His Phe Ile Trp Ala Thr Cys Gln Thr Gly Glu Ser Thr
675 680 685

Ser His Val Lys Lys His Leu Leu Ser Cys Phe Ala Val Met Gly Val
690 695 700

Pro Glu Lys Ile Lys Thr Asp Asn Gly Pro Gly Tyr Cys Ser Lys Ala
705 710 715 720

Phe Gln Lys Phe Leu Ser Gln Trp Lys Ile Ser His Thr Thr Gly Ile
725 730 735

Pro Tyr Asn Ser Gln Gly Gln Ala Ile Val Glu Arg Thr Asn Arg Thr
740 745 750

Leu Lys Thr Gln Leu Val Lys Gln Lys Glu Gly Gly Asp Ser Lys Glu
755 760 765

Cys Thr Thr Pro Gln Met Gln Leu Asn Leu Ala Leu Tyr Thr Leu Asn
770 775 780

Phe Leu Asn Ile Tyr Arg Asn Gln Thr Thr Thr Ser Ala Glu Gln His
785 790 795 800

Leu Thr Gly Lys Lys Asn Ser Pro His Glu Gly Lys Leu Ile Trp Trp
805 810 815

Lys Asp Asn Lys Asn Lys Thr Trp Glu Ile Gly Lys Val Ile Thr Trp
820 825 830

Gly Arg Gly Phe Ala Cys Val Ser Pro Gly Glu Asn Gln Leu Pro Val
835 840 845

Trp Ile Pro Thr Arg His Leu Lys Phe Tyr Asn Glu Pro Ile Arg Asp
850 855 860

Ala Lys Lys Ser Thr Ser Ala Glu Thr Glu Thr Ser Gln Ser Ser Thr
865 870 875 880

Val Asp Ser Gln Asp Glu Gln Asn Gly Asp Val Arg Arg Thr Asp Glu
885 890 895

Val Ala Ile His Gln Glu Gly Arg Ala Ala Asn Leu Gly Thr Thr Lys
900 905 910

Glu Ala Asp Ala Val Ser Tyr Lys Ile Ser Arg Glu His Lys Gly Asp
915 920 925

Thr Asn Pro Arg Glu Tyr Ala Ala Cys Ser Leu Asp Asp Cys Ile Asn
930 935 940

Gly Gly Lys Ser Pro Tyr Ala Cys Arg Ser Ser Cys Ser Ala
945 950 955

<210> 75 
<211> 12366 
<212> DNA

<213> Human endogenous retrovirus, K family (HERV-K), located at 22q11.2

<400> 75 

tgtggggaaa agaaagagag atcagactgt tactgtgtct atgtagaaag aaatagacat 60 

aagagactcc attttgttct gtactaagaa aaattcttct gctttgagat gctgttaatc 120 

tgtaacccta gccccaaccc tgtgctcaca gaaacaggtg ctgtgttgac tcaaggttta 180 

atggattcag ggctgtgcag gatgtgcttt gttaaacaaa tgcttgaagg cagcaagctt 240 

gttaagagtc atcaccactc cctaatctca agtaagcagg gacacaaaca ctgcggaagg 300 

ccgcagggac ctctgcctag gaaagccagg tgttgtccaa ggtttctccc catgtgacag 360 

tctgaaatat ggcctcttgg gaagggaaag acctgactgt cccctggccc gacacccgta 420 

aagggtctgt gctgaggatt agtaaaagag gaaggaaggc ctctttgcag ttgagataag 480 

aggaaggcat ctgtctcctg ctcatccctg ggcaatggaa tgtcttggtg taaagcctga 540 

ttgtatatgc catctactga gataggagaa aactgcctta gggctggagg tgggacatgc 600 

tggcggcaat actgctcttt aaggcattga gatgtttatg tatatgcaca tcaaaagcac 660 

agcacttttt tctttacctt gtttatgatg cagagacatt tgttcacatg ttttcctgct 720 

ggccctctcc ccactattac cctattgtcc tgccacatcc ccctctccga gatggtagag 780 

ataatgatca ataaatactg agggaactca gagaccggtg cggcgcgggt cctccatatg 840 

ctgagcgccg gtcccctggg cccacttttc tttctctata ctttgtctct gttgtctttc 900 

ttttctcaag tctctcgttc cacctgagga gaaatgccca cagctgtgga ggcgcaggcc 960 

actccatctg gtgcccaacg tggatgcttt tctctagggt gaagggactc tcgagtgtgg 1020 

tcattgagga caagtcaacg agagattccc gagtacgtct acagtgagcc ttgtggtaag 1080 

cttgggcgct cggaagaagc cagggttaat ggggcaaact aaaagtaaag tctctcattc 1140 

cacctgatga gaaacaccca gaggtgtgga ggggcaggcc accccttcag ggtagggtcc 1200 

cctccatgca gaccatagag cacaggtgtg ccccaaagag gagcagagag aaggagggag 1260 

agggcccacg agagacttgg aaatgaatgg caggatttta ggcgctggac ttgggttcgg 1320 

ggcacctggc ctttccttgt gtatttctcc tactgtctgc ctaactattt aatacaataa 1380 

aagaaaacca gcccctggtt cttgtggtgt ttccaccctc ccgggtcccc gctggctgcc 1440 

tggcttcctc ccgcagctcc tgctgtgtgt gtatgtgtgt gtgtgtgcac atctgtgggg 1500 

cgtatgtgtg ttcgtctttg taattgaggc tgcagagtgg agagagcagg ggttttctct 1560 

ggggacccag agagaaggag gcgttttcac cacagccgaa cagggcagga ccccagcacc 1620 

cgggacccag cgggactttg ccaaggggat ggacctggct gggccacgcg gctgtttgtg 1680 

tagggaaaag aaagagagat cacactgtta ctgtgtctat gtagaaaagg aagacataaa 1740 

ctccattttg agctgtacta agaaaaatta ttttgccttg acctgctgtt aacctgtaac 1800 

tgtagcccca accctgtgct caaagaaaca tgtgctgtat ggaatcaagg tttaagggat 1860 

caagggctgt acaggatgtg ccttgttaac aatgtgttta caggcagtat gcttggtaaa 1920 

agtcatcgcc attctccatt ctccattaat caggggcacg atgcactgcg gaaagccaca 1980 

gggacctctg cccgagaaag cctgggtatt gtccaaggct tccccccact gagacagcct 2040 

gagatacggc ctcgtgggaa gggaaagacc tgaccgtccc ccagcccgac acccgtaaag 2100 

ggtctgtgct gaggaggatt agtaaaaggg gaaggcctct tgcagttgag ataagaggaa 2160 

ggcctccgtc tcctgcatgt ccttgggaat ggaatgtctt ggtgtaaaac ccgatagtac 2220 

attccttcta ttctgagaga agaaaaccac cctgtggctg gaggtgagat atgctagcgg 2280 

caatgctgct ctgttactct ttgctacact gagatgtttg ggtggagaga agcataaatc 2340 

tggcctatgt gcacatctgg gcacagaacc tccccttgaa cttgtgacac agattccttt 2400 

gttcacatgt tttcctgctg accttctccc cactatcgcc ctgttctccc accgcattcc 2460 

ccttgctgag atagtgaaaa tagtaatctg tagataccaa gggaactcag agaccatggc 2520 

cggtgcacat cctccgtacg ctgagcgctg gtcccctggg cccattgttc tttctctata 2580 

ctttgtctct gtgtcttatt tctttcctca gtctctcatc cctcctgacg agaaataccc 2640 

acaggtgtgg aggggctggc ccccttcatc tgatgcccaa tgtgggtgcc tttctctagg 2700 

gtgaaggtac tctacagtgt ggtcattgag gacaagttga cgagagagtc ccaagtacgt 2760 

ccacggtcag ccttgcggta agcttgtgtg cttagaggaa cccagggtaa cgatggggca 2820 

aactgaaagt aaatatgcct cttatctcag ctttattaaa attcttttaa gaagaggggg 2880 

agttagagct tctacagaaa atctaattac gctatttcaa acaatagaac aattctgccc 2940 

atggtttcca gaacagggaa ctttagatct aaaagattgg gaaaaaattg gcaaagaatt 3000 

aaaacaagca aatagggaag gtaaaatcat cccacttaca gtatggaatg attgggccat 3060 

tattaaagca actttagaac catttcaaac aggagaagat attgtttcag tttctgatgc 3120 

ccctaaaagc tgtgtaacag attgtgaaga agaggcaggg acagaatccc agcaaggaac 3180 

ggaaagttca cattgtaaat atgtagcaga gtctgtaatg gctcagtcaa cgcaaaatgt 3240 

tgactacagt caattacagg agataatata ccctgaatca tcaaaattgg gggaaggagg 3300 

tccagaatca ttggggccat cagagcctaa accacgatcg ccatcaactc ctcctcccgt 3360 

ggttcagatg cctgtaacat tacaacctca aacgcaggtt agacaagcac aaaccccaag 3420 

agaaaatcaa gtagaaaggg acagagtctc tatcccggca atgccaactc agatacagta 3480 

tccacaatat cagccggtag aaaataagac ccaaccgctg gtagtttatc aataccggct 3540 

gccaaccgag cttcagtatc ggcctccttc agaggttcaa tacagacctc aagcggtgtg 3600 
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tcctgtgcca aatagcacgg caccatacca gcaacccaca gcgatggcgt ctaattcacc 3660 

agcaacacag gacgcggcgc tgtatcctca gccgcccact gtgagactta atcctacagc 3720 

atcacgtagt ggacagggtg gtgcactgca tgcagtcatt gatgaagcca gaaaacaggg 3780 

cgatcttgag gcatggcggt tcctggtaat tttacaactg gtacaggccg gggaagagac 3840 

tcaagtagga gcgcctgccc gagctgagac tagatgtgaa cctttcacca tgaaaatgtt 3900 

aaaagatata aaggaaggag ttaaacaata tggatccaac tccccttata taagaacatt 3960 

attagattcc attgctcatg gaaatagact tactccttat gactgggaaa ttttggccaa 4020 

atcttccctt tcatcctctc agtatctaca gtttaaaacc tggtggattg atggagtaca 4080 

agaacaggta cgaaaaaatc aggctactaa gcccactgtt aatatagacg cagaccaatt 4140 

gttaggaaca ggtccaaatt ggagcaccat taaccaacaa tcagtgatgc agaatgaggc 4200 

tattgaacaa gtaagggcta tttgcctcag ggcctgggga aaaattcagg acccaggaac 4260 

agctttccct attaattcaa ttagacaagg ctctaaagag ccatatcctg actttgtggc 4320 

aagattacaa gatgctgctc aaaagtctat tacagatgac aatgcccgaa aagttattgt 4380 

agaattaatg gcctatgaaa atgcaaatcc agaatgtcag tcggccataa agccattaaa 4440 

aggaaaagtt ccagcaggag ttgatgtaat tacagaatat gtgaaggctt gtgatgggat 4500 

tggaggagct atgcataagg caatgctaat ggctcaagca atgagggggc tcactctagg 4560 

aggacaagtt agaacatttg ggaaaaaatg ttataattgt ggtcaaatcg gtcatctgaa 4620 

aaggagttgc ccaggcttaa ataaacagaa tataataaat caagctatta cagcaaaaaa 4680 

taaaaagcca tctggcctgt gtccaaaatg tggaaaagca aaacattggg ccaatcaatg 4740 

tcattctaaa tttgataaag atgggcaacc attgtctgga aacaggaaga ggggccagcc 4800 

tcaggccccc caacaaactg gggcattccc agttaaactg tttgttcctc agggttttca 4860 

aggacaacaa cccctacaga aaataccacc acttcaggga gtcagccaat tacaacaatc 4920 

caacagctgt cccgcgccac agcaggcagc accgcagtag atttatgttc cacccaaatg 4980 

gtctttttac tccctggaaa gcccccacaa aagattccta gaggggtata tggcccgctg 5040 

ccagaaggga gggtaggcct ttgagggaga tcaagtctaa atttgaaggg agtccaaatt 5100 

catactgggg taatttattc agattataaa gggggaattc agttagtgat cagctccact 5160 

gttccccgga gtgccaatcc aggtgataga attgctcaat tactgctttt gccttatgtt 5220 

aaaattgggg aaaacaaaaa ggaaagaaca ggagggtttg gaagtaccaa ccctgcagga 5280 

aaagctgctt attgggctaa tcaggtctca gaggatagac ccgtgtgtac agtcactatt 5340 

cagggaaaga gtttgaagga ttagtggata cccaggctga tgtttctgtc atcggcatag 5400 

gtactgcctc agaagtgtat caaagtgcca tgattttaca ttgtccagga tctgataatc 5460 

aagaaagtac ggttcagcct gtgatcactt cattccaatc aatttatggg gccgagactt 5520 

gttacaacaa tggcatgcag agattactat cccagcctcc ctatacagcc ccaggaataa 5580 

aaaaatcatg actaaaatgg gatagctccc taaaaaggga ctaggaaaga agtcccaatt 5640 

gaggctgaaa aaaatcaaaa aagaaaagga atagggcatc ctttttagga gcggtcactg 5700 

tagagcctcc aaaacccatt ccattaactt gggggaaaaa aaaacaactg tatggtaaat 5760 

cagcagcgct tccaaaacaa aaactggagg ctttacattt attagcaaag aaacaattag 5820 

aaaaaggaca ttgagccttc attttcgcct tggaattctg tttgtaattc agaaaaaatc 5880 

cggcagatgg cgtataatgc cgtaattcaa cccatggggg ctctcccacc ccggttgccc 5940 

tctccagcca tggtcccctt taattataat tgatctgaag gattgctttt ttaccattcc 6000 

tctggcaaaa caggattttg aaaaatttgc ttttaccaca ccagcctaaa taataaagaa 6060 

ccagccacca ggtttcagtg gaaagtattg cctcagggaa tgcttaatag ttcaactatt 6120 

tgtcagctca agctctgcaa ccagttagag acaagttttc agactgttac atcgttcact 6180 

atgttgatat tttgtgtgct gcagaaacga gagacaaatt aattgaccgt tacacatttc 6240 

tgcagacaga ggttgccaac gcgggactga caataacatc tgataagatt caaacctcta 6300 

ctcctttccg ttacttggga atgcaggtag aggaaaggaa aattaaacca caaaaaatag 6360 

aaataagaaa agacacatta aaagcattaa atgagtttca aaagttgcta ggagatacta 6420 

attggatttg gagatattaa ttggatttgg ccaactctag gcattcctac ttatgccatg 6480 

tcaaatttgt tctctttctt aagaggggac tcggaattaa atagtgaaag aacgttaact 6540 

ccagaggcaa ctaaagaaat taaattaatt gaagaaaaaa ttcggtcagc acaagtaaat 6600 

agaatagatc acttggcccc actccaaatt ttgatttttg ctactgcaca ttccctaaca 6660 

ggcatcattg ttcaaaatac agatcttgtg gagtggtcct tccttcctca cagtacaatt 6720 

aagactttta cattgtactt ggatcaaatg gctacattaa ttggtcaggg aagattatga 6780 

ataataacat tgtgtggaaa tgacccagat aaaatcactg ttcctttcaa caagcaacag 6840 

gttagacaag cctttatcaa ttctggtgca tggcagattg gtcttgccga ttttgtggga 6900 

attattgaca atcgttaccc caaaacaaaa atcttccagt ttttaaaatt gactacttgg 6960 

attttaccta aagttaccaa acataagcct ttaaaaaatg ctctggcagt gtttactgat 7020 

ggttccagca atggaaaagt ggcttacacc gggccaaaag aatgagtcat caaaactcag 7080 

tatcacttga ctcaaagagc agagttggtt gccgtcatta cagtgttaac aagattttaa 7140 

tcagtctatt aacattgtat cagattctgc atatgtagta caggctacaa aggatattga 7200 

gagagcccta atcaaataca ttatggatga tcagttaaac ccgctgttta atttgttaca 7260 

acaaaatgta agaaaaagaa atttcccatt ttatattact catattcgag cacacactaa 7320 

tttaccaggg cctttaacta aagcaaatga acaagctgac ttgctagtat catctgcatt 7380

catggaagca caagaacttc atgccttgac
atttgatatc acatggaaac agacaaaaaa 
tctacacctg gccactcagg aggcaagagt 
atggcaaatg gatgtcatgc acgtaccttc 
agttgatact tattcacatt tcatatgggc 
tgttaaaaga catttattat cttgttttcc 
agacaatggg ccaggttact gtagtaaagc 
tacacataca ataggaattc tctataattc 
tagaacactc aaagctcaat tggttaaaca 
ccccagatgc aacttaatct agcactctat 
cagaccacta cctctgcaga acaacatctt 
aaactgattt ggtggaaaga taataaaaat 
tgggggagag gttttgcttg tgtttcacca 
actagacatt taaagttcta caatgaactc 
gagacacccc aatcgactcg ccaggtaaac 
ttgccttcca tcaaggaagc agagttgcca 
ttagctaaaa aaaaaagcct agagaataca 
cttgcagctc tgatgattgt atcaacggtg 
gcagctaatt atacttactg ggcctatgtg 
tagatggata atcctattga agtagatgtt 
gatgactgtt gccctgccca acctgaagaa 
ccttatcctc ctgtttgcct agggaaggca 
tggttggtag aagtacctac agtcagtgct 
ggaatgtcac agataaataa tttacaggac 
cctaagggga aggcttgccc caaggaaatt 
gtctgcggag aatgtgtggc tgatactgca 
tgatagactg ggtcccttga ggccaattat 
gttcacaggc cccatccatc tggcccatta 
ggctggacca ggtttataga aggttagaat 
gaatttcatc accttgacca aagttagtcc 
aagcttactg tggcctcaca ccacattaga 
agagatcgta agtcatatta tactatcaac 
aattgtgtaa aactccctta tattgctagt 
tcccaaacca taatctgtga aaattgtgga 
tggcagcacc gtattctact aggaagagca 
gaccgaccat gggaggcttc gctatccatc 
ctaactagat ccaaaagatt catttttact 
gtcacagcta ctgctgcggc tgctggaatt 
tacgtaaatg attggcaaaa gaattcctca 
caaaaattgg caaaccaaat taatgatctt 
catgagcttg gaatatcttt ttcagttacg 
tacaccacaa gcctataatg agtctgagca 
aggaggagaa gataatctta ctttagacat 
gagacagagt ctcgctctgt cgcccaggct 
caagttccgc ctcctgggtt tacaccattc 
tacaggagcc caccaccatg cctggctaat 
tcaccgtgtt agccaggatg gtctcgatct 
cccaaagtgc tgggattaca gtcgtgagcc 
gcatcaaaag cccatttaaa tttggtgcca 
agcctcacaa atcttaagcc agtcacttgg 
aatttcatat taatccttgt atgcctgttc 
cagctccaaa gagacagcaa ccagcaagaa 
aaaagaaaag ggggggatat gtaaggaaaa 
gaaaaggaag acataagaaa ctccattttg 
agatgctgtt aatctgtaac tttagcccca 
ggtttaaggg atctagggct gtgcaggatg 
atgtttggta aaagtcatcg ccattctcca 
tggaaagcca caggaacctc tgcccaagaa 
cgaatggagg gaccagctgg tgctgcatca 
atttatcagt ttccaaaatt aatactttta 
cttaatcctg ttatctttgt aagctgagga 
aattgattgt aaaacatgtt cacatgtgtt 
atgaacagaa taacagtgat tttagggaac 
tcatgtaaat gcaataggat taaaaaataa 
tattgtacaa cattgcaccc agtgtcagat 
taatcccaga ggtctatgtc ctaatgtgtt 
atttggaaaa ttgtcatttg tccatgtgac 
aacctgccag acaggagaaa gtacttccca 
tgtcatggga gttccagaaa aagttaaaac 
agttcaaaaa ttcttaaatc agtggaaaat 
ccaaggacag gccataattg aaagaactaa 
aaaaaaagga aaagacagga gtataacact 
actttaaatg ttttaaacat ttatagaaat 
actggtaaaa ggaacagccc acatgaagga 
aaaacatggg aaatggggaa ggtgataacg 
ggagaaaatc agcttcctgt ttggataccc 
actggagatg caaagaaaag tgtggagatg 
aaaatggtga tatcagaaga acagaaaaag 
atataggcac aattaaagaa gctgacacag 
aaggtgacac caactccaga gaatatgctg 
gtaagtcttc ccaagtctgc aggagcagct 
cctttcccac ccttaattcg ggcagttaca 
aataatagtg catgggtgcc tggccccaca 
ggaatgatga tgaatatttc cattgggtat 
ccaggatgct taatgcctac aacccaaaat 
accagtagat ttacttatca catggtaagt 
ccttcttatc aaagatcatt acaatgtagg 
cccaaagaat caaaaagccc agaagtctta 
gtgtagtaca aaacaatgaa ttttgaacta 
atcataactg tacaggccag actcattcat 
atccagccta tgacggtgat gtaactgaaa 
cactctgtcc aaggaaatgg ggtgaaaagg 
tgttactggt cctgaacatc cagaattagg 
atttgttctg gaaatcaagc tataggaaca 
ctaaattcca gtctgacaat tcctttgcaa 
tgtaggaaaa acatagttat taaacctgat 
atgtttactt gcattgattt gacttttaat 
agagagggtg tgtggatcct tgtgtccatg 
catattttaa cggaagtatt aaaaggaatt 
ttgatggcag tgattatggg cctcattgca 
gctttacact cctctgttca aactgcagaa 
aaattgtgga attctcagat ccaaatagat 
agacaaactg tcatttggat gggagaggct 
atgtgactgg aatacatcag atttttgtgt 
tcactgggac atggttagat gccatctgca 
ttcaaaatta aaagaatttt ttttttcttt 
ggagtgcagt ggcgtgatct cagctcactg 
tcctgcctca gcctcccaag tagttgggac 
tttttttggg tttttaatag agatggagtt 
cctgaccttg tgatctgccc accttggcct 
accgtgccca gccaagaaaa aatttttgag 
ggaacggaga caatcgtgaa agctgctgat 
gttaaaagca tcagaagttt cactattgta 
tgtctgttgt tagtctacag gtgtatccag 
tgggccatag tgacgatggt ggttttgtca 
gagagatcag actttcactg tgtctatgta 
atctgtacta agaaaaattg ttttgccttg 
accctgtgct cacggaaaca tgtgctgtaa 
taccttgtta acaatatgtt tgcaggcagt 
ttctcgatta accaggggct caatgcactg 
agcctggctg ttgtgggaag tcagggaccc 
ggaaacataa attgtgaaga tttcttggac 
taatttctta cacctgtctt actttaatct 
tatacgtcac ctcaggacca ctattgtaca 
tgaacaatat gaaatcagtg caccttgaaa 
aaaggaagac aaccataagg tctgactgcc 
tgaggggtcg ggcaaaaagc catatttttc ttcttgcaga gagcctataa atggacgtgc 11220 

aagtaggaga gatattgcta aattcttttc ctagcaagga atataatact aagaccctag 11280 

ggaaagaatt gcattcctgg ggggaggtct ataaacggcc gctctgggag tgtctgtcct 11340 

atgtggttga gataaggact gagatacgcc ctggtctcct gcagtaccct caggcttact 11400 

aggattggga aaccccagtc ctggtaaatt tgaggtcagg ccggttcttt gctctgaacc 11460 

ctgttttctg ttaagatgtt tatcaagaca atacatgcac cgctgaacat agacccttat 11520 

caggagtttc tgattttgct ctggtcctgt ttcttcagaa gcatgtcatc tttgctctgc 11580 

cttctgccct ttgaagcatg tgatctttgt gacctactcc ctgttcatac acccctcccc 11640 

ttttaaaatc cctaataaaa acttgctggt tttgtggctc aggggggcat catggaccta 11700 

ccaatacgtg atgtcacccc cggtggccca gctgtaaaat tcctttcttt atactcttat 11760 

ttctcagacc agctgacact tagggaaaat agaaagaacc tatgttgaaa tattggaggc 11820 

gggttccccc gatacctggg tattgtccaa ggtttccttt gctgaggagg attagtaaaa 11880 

ggaatgcctc catctcctgc atgtccctgg gaacagaatg ttcccaccaa ccaccctgtg 11940 

gctggaggcg ggatatgctg gcagcaatgc tgctctatta ctctttgcta cactgagatg 12000 

tttgggtgga gagaagcata aatctggcct atgtgcacat ctgggcacag caccttcctt 12060 

tgaacttatt tgtgacacag attcctttgc tcacgttttc ctgttgactt tctcaccact 12120 

caccctattc tcctgtggca ttcgccttgc ggagatagtg aaaatagtaa taaatactga 12180 

gggaactcag actgagggaa ctcagactgg gcagaccggg gccagtgtgg gtcctccata 12240 

tgctgagcgc cggttccctg ggcccactgt tctttctcta tactttgtct ctgtgcctta 12300 

ttttctcagt ctctcattcc acctgatgag aaatacccac aggtgtggag gggctggccc 12360 

ccttca 12366 

<210> 76 
<211> 2148 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 76 

atggggcaaa ctgaaagtaa atatgcctct tatctcagct ttattaaaat tcttttaaga 60 

agagggggag ttagagcttc tacagaaaat ctaattacgc tatttcaaac aatagaacaa 120 

ttctgcccat ggtttccaga acagggaact ttagatctaa aagattggga aaaaattggc 180 

aaagaattaa aacaagcaaa tagggaaggt aaaatcatcc cacttacagt atggaatgat 240 

tgggccatta ttaaagcaac tttagaacca tttcaaacag gagaagatat tgtttcagtt 300 

tctgatgccc ctaaaagctg tgtaacagat tgtgaagaag aggcagggac agaatcccag 360 

caaggaacgg aaagttcaca ttgtaaatat gtagcagagt ctgtaatggc tcagtcaacg 420 

caaaatgttg actacagtca attacaggag ataatatacc ctgaatcatc aaaattgggg 480 

gaaggaggtc cagaatcatt ggggccatca gagcctaaac cacgatcgcc atcaactcct 540 

cctcccgtgg ttcagatgcc tgtaacatta caacctcaaa cgcaggttag acaagcacaa 600 

accccaagag aaaatcaagt agaaagggac agagtctcta tcccggcaat gccaactcag 660 

atacagtatc cacaatatca gccggtagaa aataagaccc aaccgctggt agtttatcaa 720 

taccggctgc caaccgagct tcagtatcgg cctccttcag aggttcaata cagacctcaa 780 

gcggtgtgtc ctgtgccaaa tagcacggca ccataccagc aacccacagc gatggcgtct 840 

aattcaccag caacacagga cgcggcgctg tatcctcagc cgcccactgt gagacttaat 900 

cctacagcat cacgtagtgg acagggtggt gcactgcatg cagtcattga tgaagccaga 960 

aaacagggcg atcttgaggc atggcggttc ctggtaattt tacaactggt acaggccggg 1020 

gaagagactc aagtaggagc gcctgcccga gctgagacta gatgtgaacc tttcaccatg 1080 

aaaatgttaa aagatataaa ggaaggagtt aaacaatatg gatccaactc cccttatata 1140 

agaacattat tagattccat tgctcatgga aatagactta ctccttatga ctgggaaatt 1200 

ttggccaaat cttccctttc atcctctcag tatctacagt ttaaaacctg gtggattgat 1260 

ggagtacaag aacaggtacg aaaaaatcag gctactaagc ccactgttaa tatagacgca 1320 

gaccaattgt taggaacagg tccaaattgg agcaccatta accaacaatc agtgatgcag 1380 

aatgaggcta ttgaacaagt aagggctatt tgcctcaggg cctggggaaa aattcaggac 1440 

ccaggaacag ctttccctat taattcaatt agacaaggct ctaaagagcc atatcctgac 1500 

tttgtggcaa gattacaaga tgctgctcaa aagtctatta cagatgacaa tgcccgaaaa 1560 

gttattgtag aattaatggc ctatgaaaat gcaaatccag aatgtcagtc ggccataaag 1620 

ccattaaaag gaaaagttcc agcaggagtt gatgtaatta cagaatatgt gaaggcttgt 1680 

gatgggattg gaggagctat gcataaggca atgctaatgg ctcaagcaat gagggggctc 1740 

actctaggag gacaagttag aacatttggg aaaaaatgtt ataattgtgg tcaaatcggt 1800 

catctgaaaa ggagttgccc agtcttaaat aaacagaata taataaatca agctattaca 1860 

gcaaaaaata aaaagccatc tggcctgtgt ccaaaatgtg gaaaaggaaa acattgggcc 1920 

aatcaatgtc attctaaatt tgataaggat gggcaaccat tgtcgggaaa caggaagagg 1980 

ggccagcctc aggcccccca acaaactggg gcattcccag ttcaactgtt tgttcctcag 2040 

ggttttcaag gacaacaacc cctacagaaa ataccaccac ttcagggagt cagccaatta 2100 

caacaatcca acagctgtcc cgcgccacag caggcagcac cgcagtaa 

<210> 77 
<211> 2151 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Manipulated Gag 
<400> 77 

atgggccaga ccgagagcaa gtacgccagc tacctgagct tcatcaagat cctgctgcgc 
cgcggcggcg tgcgcgccag caccgagaac ctgatcaccc tgttccagac catcgagcag 
ttctgcccct ggttccccga gcagggcacc ctggacctga aggactggga gaagatcggc 
aaggagctga agcaggccaa ccgcgagggc aagatcatcc ccctgaccgt gtggaacgac 
tgggccatca tcaaggccac cctggagccc ttccagaccg gcgaggacat cgtgagcgtg 
agcgacgccc ccaagagctg cgtgaccgac tgcgaggagg aggccggcac cgagagccag 
cagggcaccg agagcagcca ctgcaagtac gtggccgaga gcgtgatggc ccagagcacc 
cagaacgtgg actacagcca gctgcaggag atcatctacc ccgagagcag caagctgggc 
gagggcggcc ccgagagcct gggccccagc gagcccaagc cccgcagccc cagcaccccc 
ccccccgtgg tgcagatgcc cgtgaccctg cagccccaga cccaggtgcg ccaggcccag 
accccccgcg agaaccaggt ggagcgcgac cgcgtgagca tccccgccat gcccacccag 
atccagtacc cccagtacca gcccgtggag aacaagaccc agcccctggt ggtgtaccag 
taccgcctgc ccaccgagct gcagtaccgc ccccccagcg aggtgcagta ccgcccccag 
gccgtgtgcc ccgtgcccaa cagcaccgcc ccctaccagc agcccaccgc catggccagc 
aacagccccg ccacccagga cgccgccctg tacccccagc cccccaccgt gcgcctgaac 
cccaccgcca gccgcagcgg ccagggcggc gccctgcacg ccgtgatcga cgaggcccgc 
aagcagggcg acctggaggc ctggcgcttc ctggtgatcc tgcagctggt gcaggccggc 
gaggagaccc aggtgggcgc ccccgcccgc gccgagaccc gctgcgagcc cttcaccatg 
aagatgctga aggacatcaa ggagggcgtg aagcagtacg gcagcaacag cccctacatc 
cgcaccctgc tggacagcat cgcccacggc aaccgcctga ccccctacga ctgggagatc 
ctggccaaga gcagcctgag cagcagccag tacctgcagt tcaagacctg gtggatcgac 
ggcgtgcagg agcaggtgcg caagaaccag gccaccaagc ccaccgtgaa catcgacgcc 
gaccagctgc tgggcaccgg ccccaactgg agcaccatca accagcagag cgtgatgcag 
aacgaggcca tcgagcaggt gcgcgccatc tgcctgcgcg cctggggcaa gatccaggac 
cccggcaccg ccttccccat caacagcatc cgccagggca gcaaggagcc ctaccccgac 
ttcgtggccc gcctgcagga cgccgcccag aagagcatca ccgacgacaa cgcccgcaag 
gtgatcgtgg agctgatggc ctacgagaac gccaaccccg agtgccagag cgccatcaag 
cccctgaagg gcaaggtgcc cgccggcgtg gacgtgatca ccgagtacgt gaaggcctgc 
gacggcatcg gcggcgccat gcacaaggcc atgctgatgg cccaggccat gcgcggcctg 
accctgggcg gccaggtgcg caccttcggc aagaagtgct acaactgcgg ccagatcggc 
cacctgaagc gcagctgccc cgtgctgaac aagcagaaca tcatcaacca ggccatcacc 
gccaagaaca agaagcccag cggcctgtgc cccaagtgcg gcaagggcaa gcactgggcc 
aaccagtgcc acagcaagtt cgacaaggac ggccagcccc tgagcggcaa ccgcaagcgc 
ggccagcccc aggcccccca gcagaccggc gccttccccg tgcagctgtt cgtgccccag 
ggcttccagg gccagcagcc cctgcagaag atcccccccc tgcagggcgt gagccagctg 
cagcagagca acagctgccc cgccccccag caggccgccc cccaggctta a 

<210> 78 
<211> 715 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 78 

Met Gly Gln Thr Glu Ser Lys Tyr Ala Ser Tyr Leu Ser Phe Ile Lys 
1 5 10 15 

Ile Leu Leu Arg Arg Gly Gly Val Arg Ala Ser Thr Glu Asn Leu Ile 
20 25 30 

Thr Leu Phe Gln Thr Ile Glu Gln Phe Cys Pro Trp Phe Pro Glu Gln 
35 40 45 

Gly Thr Leu Asp Leu Lys Asp Trp Glu Lys Ile Gly Lys Glu Leu Lys 
50 55 60 

Gln Ala Asn Arg Glu Gly Lys Ile Ile Pro Leu Thr Val Trp Asn Asp 
65 70 75 80 

Trp Ala Ile Ile Lys Ala Thr Leu Glu Pro Phe Gln Thr Gly Glu Asp 
85 90 95 

Ile Val Ser Val Ser Asp Ala Pro Lys Ser Cys Val Thr Asp Cys Glu 
100 105 110 

Glu Glu Ala Gly Thr Glu Ser Gln Gln Gly Thr Glu Ser Ser His Cys 
115 120 125 

Lys Tyr Val Ala Glu Ser Val Met Ala Gln Ser Thr Gln Asn Val Asp 
130 135 140 

Tyr Ser Gln Leu Gln Glu Ile Ile Tyr Pro Glu Ser Ser Lys Leu Gly 
145 150 155 160 

Glu Gly Gly Pro Glu Ser Leu Gly Pro Ser Glu Pro Lys Pro Arg Ser 
165 170 175 

Pro Ser Thr Pro Pro Pro Val Val Gln Met Pro Val Thr Leu Gln Pro 
180 185 190 

Gln Thr Gln Val Arg Gln Ala Gln Thr Pro Arg Glu Asn Gln Val Glu 
195 200 205 

Arg Asp Arg Val Ser Ile Pro Ala Met Pro Thr Gln Ile Gln Tyr Pro 
210 215 220 

Gln Tyr Gln Pro Val Glu Asn Lys Thr Gln Pro Leu Val Val Tyr Gln 
225 230 235 240 

Tyr Arg Leu Pro Thr Glu Leu Gln Tyr Arg Pro Pro Ser Glu Val Gln 
245 250 255 

Tyr Arg Pro Gln Ala Val Cys Pro Val Pro Asn Ser Thr Ala Pro Tyr 
260 265 270 

Gln Gln Pro Thr Ala Met Ala Ser Asn Ser Pro Ala Thr Gln Asp Ala 
275 280 285 

Ala Leu Tyr Pro Gln Pro Pro Thr Val Arg Leu Asn Pro Thr Ala Ser 
290 295 300 

Arg Ser Gly Gln Gly Gly Ala Leu His Ala Val Ile Asp Glu Ala Arg 
305 310 315 320 

Lys Gln Gly Asp Leu Glu Ala Trp Arg Phe Leu Val Ile Leu Gln Leu 
325 330 335 

Val Gln Ala Gly Glu Glu Thr Gln Val Gly Ala Pro Ala Arg Ala Glu 
340 345 350 

Thr Arg Cys Glu Pro Phe Thr Met Lys Met Leu Lys Asp Ile Lys Glu 
355 360 365 

Gly Val Lys Gln Tyr Gly Ser Asn Ser Pro Tyr Ile Arg Thr Leu Leu 
370 375 380 

Asp Ser Ile Ala His Gly Asn Arg Leu Thr Pro Tyr Asp Trp Glu Ile 
385 390 395 400 

Leu Ala Lys Ser Ser Leu Ser Ser Ser Gln Tyr Leu Gln Phe Lys Thr 
405 410 415 

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Lys Asn Gln Ala Thr 
420 425 430 

Lys Pro Thr Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Thr Gly Pro 
435 440 445 

Asn Trp Ser Thr Ile Asn Gln Gln Ser Val Met Gln Asn Glu Ala Ile 
450 455 460 

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Gly Lys Ile Gln Asp 
465 470 475 480 

Pro Gly Thr Ala Phe Pro Ile Asn Ser Ile Arg Gln Gly Ser Lys Glu 
485 490 495 

Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Ala Ala Gln Lys Ser 
500 505 510 

Ile Thr Asp Asp Asn Ala Arg Lys Val Ile Val Glu Leu Met Ala Tyr 
515 520 525 

Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys Gly 
530 535 540 

Lys Val Pro Ala Gly Val Asp Val Ile Thr Glu Tyr Val Lys Ala Cys 
545 550 555 560 

Asp Gly Ile Gly Gly Ala Met His Lys Ala Met Leu Met Ala Gln Ala 
565 570 575 

Met Arg Gly Leu Thr Leu Gly Gly Gln Val Arg Thr Phe Gly Lys Lys 
580 585 590 

Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Arg Ser Cys Pro Val 
595 600 605 

Leu Asn Lys Gln Asn Ile Ile Asn Gln Ala Ile Thr Ala Lys Asn Lys 
610 615 620 

Lys Pro Ser Gly Leu Cys Pro Lys Cys Gly Lys Gly Lys His Trp Ala 
625 630 635 640 

Asn Gln Cys His Ser Lys Phe Asp Lys Asp Gly Gln Pro Leu Ser Gly 
645 650 655 

Asn Arg Lys Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe 
660 665 670 

Pro Val Gln Leu Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Leu 
675 680 685 

Gln Lys Ile Pro Pro Leu Gln Gly Val Ser Gln Leu Gln Gln Ser Asn 
690 695 700 

Ser Cys Pro Ala Pro Gln Gln Ala Ala Pro Gln 
705 710 715 

<210> 79 
<211> 716 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Manipulated Gag 
<400> 79 

Met Gly Gln Thr Glu Ser Lys Tyr Ala Ser Tyr Leu Ser Phe Ile Lys 
1 5 10 15 

Ile Leu Leu Arg Arg Gly Gly Val Arg Ala Ser Thr Glu Asn Leu Ile 
20 25 30 

Thr Leu Phe Gln Thr Ile Glu Gln Phe Cys Pro Trp Phe Pro Glu Gln 
35 40 45 

Gly Thr Leu Asp Leu Lys Asp Trp Glu Lys Ile Gly Lys Glu Leu Lys 
50 55 60 

Gln Ala Asn Arg Glu Gly Lys Ile Ile Pro Leu Thr Val Trp Asn Asp 
65 70 75 80 

Trp Ala Ile Ile Lys Ala Thr Leu Glu Pro Phe Gln Thr Gly Glu Asp 
85 90 95 

Ile Val Ser Val Ser Asp Ala Pro Lys Ser Cys Val Thr Asp Cys Glu 
100 105 110 

Glu Glu Ala Gly Thr Glu Ser Gln Gln Gly Thr Glu Ser Ser His Cys 
115 120 125 

Lys Tyr Val Ala Glu Ser Val Met Ala Gln Ser Thr Gln Asn Val Asp 
130 135 140 

Tyr Ser Gln Leu Gln Glu Ile Ile Tyr Pro Glu Ser Ser Lys Leu Gly 
145 150 155 160 

Glu Gly Gly Pro Glu Ser Leu Gly Pro Ser Glu Pro Lys Pro Arg Ser 
165 170 175 

Pro Ser Thr Pro Pro Pro Val Val Gln Met Pro Val Thr Leu Gln Pro 
180 185 190 

Gln Thr Gln Val Arg Gln Ala Gln Thr Pro Arg Glu Asn Gln Val Glu 
195 200 205 

Arg Asp Arg Val Ser Ile Pro Ala Met Pro Thr Gln Ile Gln Tyr Pro 
210 215 220 

Gln Tyr Gln Pro Val Glu Asn Lys Thr Gln Pro Leu Val Val Tyr Gln 
225 230 235 240 

Tyr Arg Leu Pro Thr Glu Leu Gln Tyr Arg Pro Pro Ser Glu Val Gln 
245 250 255 

Tyr Arg Pro Gln Ala Val Cys Pro Val Pro Asn Ser Thr Ala Pro Tyr 
260 265 270 

Gln Gln Pro Thr Ala Met Ala Ser Asn Ser Pro Ala Thr Gln Asp Ala 
275 280 285 

Ala Leu Tyr Pro Gln Pro Pro Thr Val Arg Leu Asn Pro Thr Ala Ser 
290 295 300 

Arg Ser Gly Gln Gly Gly Ala Leu His Ala Val Ile Asp Glu Ala Arg 
305 310 315 320 

Lys Gln Gly Asp Leu Glu Ala Trp Arg Phe Leu Val Ile Leu Gln Leu 
325 330 335 

Val Gln Ala Gly Glu Glu Thr Gln Val Gly Ala Pro Ala Arg Ala Glu 
340 345 350 

Thr Arg Cys Glu Pro Phe Thr Met Lys Met Leu Lys Asp Ile Lys Glu 
355 360 365 

Gly Val Lys Gln Tyr Gly Ser Asn Ser Pro Tyr Ile Arg Thr Leu Leu 
370 375 380 

Asp Ser Ile Ala His Gly Asn Arg Leu Thr Pro Tyr Asp Trp Glu Ile 
385 390 395 400 

Leu Ala Lys Ser Ser Leu Ser Ser Ser Gln Tyr Leu Gln Phe Lys Thr 
405 410 415 

Trp Trp Ile Asp Gly Val Gln Glu Gln Val Arg Lys Asn Gln Ala Thr 
420 425 430 

Lys Pro Thr Val Asn Ile Asp Ala Asp Gln Leu Leu Gly Thr Gly Pro 
435 440 445 

Asn Trp Ser Thr Ile Asn Gln Gln Ser Val Met Gln Asn Glu Ala Ile 
450 455 460 

Glu Gln Val Arg Ala Ile Cys Leu Arg Ala Trp Gly Lys Ile Gln Asp 
465 470 475 480 

Pro Gly Thr Ala Phe Pro Ile Asn Ser Ile Arg Gln Gly Ser Lys Glu 
485 490 495 

Pro Tyr Pro Asp Phe Val Ala Arg Leu Gln Asp Ala Ala Gln Lys Ser 
500 505 510 

Ile Thr Asp Asp Asn Ala Arg Lys Val Ile Val Glu Leu Met Ala Tyr 
515 520 525 

Glu Asn Ala Asn Pro Glu Cys Gln Ser Ala Ile Lys Pro Leu Lys Gly 
530 535 540 

Lys Val Pro Ala Gly Val Asp Val Ile Thr Glu Tyr Val Lys Ala Cys 
545 550 555 560 

Asp Gly Ile Gly Gly Ala Met His Lys Ala Met Leu Met Ala Gln Ala 
565 570 575 

Met Arg Gly Leu Thr Leu Gly Gly Gln Val Arg Thr Phe Gly Lys Lys 
580 585 590 

Cys Tyr Asn Cys Gly Gln Ile Gly His Leu Lys Arg Ser Cys Pro Val 
595 600 605 

Leu Asn Lys Gln Asn Ile Ile Asn Gln Ala Ile Thr Ala Lys Asn Lys 
610 615 620 

Lys Pro Ser Gly Leu Cys Pro Lys Cys Gly Lys Gly Lys His Trp Ala 
625 630 635 640 

Asn Gln Cys His Ser Lys Phe Asp Lys Asp Gly Gln Pro Leu Ser Gly 
645 650 655 

Asn Arg Lys Arg Gly Gln Pro Gln Ala Pro Gln Gln Thr Gly Ala Phe 
660 665 670 

Pro Val Gln Leu Phe Val Pro Gln Gly Phe Gln Gly Gln Gln Pro Leu 
675 680 685 

Gln Lys Ile Pro Pro Leu Gln Gly Val Ser Gln Leu Gln Gln Ser Asn 
690 695 700 

Ser Cys Pro Ala Pro Gln Gln Ala Ala Pro Gln Ala 
705 710 715 

<210> 80 
<211> 6486 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> pCMVKm2.gagopt PCAV vector 
<400> 80 

gccgcggaat ttcgactcta ggccattgca tacgttgtat ctatatcata atatgtacat 60 

ttatattggc tcatgtccaa tatgaccgcc atgttgacat tgattattga ctagttatta 120 

atagtaatca attacggggt cattagttca tagcccatat atggagttcc gcgttacata 180 

acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat tgacgtcaat 240 

aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc aatgggtgga 300 

gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc caagtccgcc 360 

ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt acatgacctt 420 

acgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta ccatggtgat 480 

gcggttttgg cagtacacca atgggcgtgg atagcggttt gactcacggg gatttccaag 540 

tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac gggactttcc 600 

aaaatgtcgt aataaccccg ccccgttgac gcaaatgggc ggtaggcgtg tacggtggga 660 

ggtctatata agcagagctc gtttagtgaa ccgtcagatc gcctggagac gccatccacg 720 

ctgttttgac ctccatagaa gacaccggga ccgatccagc ctccgcggcc gggaacggtg 780 

cattggaacg cggattcccc gtgccaagag tgacgtaagt accgcctata gactctatag 840 

gcacacccct ttggctctta tgcatgctat actgtttttg gcttggggcc tatacacccc 900 

cgcttcctta tgctataggt gatggtatag cttagcctat aggtgtgggt tattgaccat 960 

tattgaccac tcccctattg gtgacgatac tttccattac taatccataa catggctctt 1020 

tgccacaact atctctattg gctatatgcc aatactctgt ccttcagaga ctgacacgga 1080 

ctctgtattt ttacaggatg gggtcccatt tattatttac aaattcacat atacaacaac 1140 

gccgtccccc gtgcccgcag tttttattaa acatagcgtg ggatctccac gcgaatctcg 1200 

ggtacgtgtt ccggacatgg gctcttctcc ggtagcggcg gagcttccac atccgagccc 1260 

tggtcccatg cctccagcgg ctcatggtcg ctcggcagct ccttgctcct aacagtggag 1320 

gccagactta ggcacagcac aatgcccacc accaccagtg tgccgcacaa ggccgtggcg 1380 

gtagggtatg tgtctgaaaa tgagctcgga gattgggctc gcaccgctga cgcagatgga 1440 

agacttaagg cagcggcaga agaagatgca ggcagctgag ttgttgtatt ctgataagag 1500 

tcagaggtaa ctcccgttgc ggtgctgtta acggtggagg gcagtgtagt ctgagcagta 1560 

ctcgttgctg ccgcgcgcgc caccagacat aatagctgac agactaacag actgttcctt 1620 

tccatgggtc ttttctgcag tcaccgtcgt cgacgccacc atgggccaga ccgagagcaa 1680 

gtacgccagc tacctgagct tcatcaagat cctgctgcgc cgcggcggcg tgcgcgccag 1740 

caccgagaac ctgatcaccc tgttccagac catcgagcag ttctgcccct ggttccccga 1800 

gcagggcacc ctggacctga aggactggga gaagatcggc aaggagctga agcaggccaa 1860 

ccgcgagggc aagatcatcc ccctgaccgt gtggaacgac tgggccatca tcaaggccac 1920 

cctggagccc ttccagaccg gcgaggacat cgtgagcgtg agcgacgccc ccaagagctg 1980 

cgtgaccgac tgcgaggagg aggccggcac cgagagccag cagggcaccg agagcagcca 2040 

ctgcaagtac gtggccgaga gcgtgatggc ccagagcacc cagaacgtgg actacagcca 2100 

gctgcaggag atcatctacc ccgagagcag caagctgggc gagggcggcc ccgagagcct 2160 
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gggccccagc gagcccaagc cccgcagccc cagcaccccc ccccccgtgg tgcagatgcc 2220 

cgtgaccctg cagccccaga cccaggtgcg ccaggcccag accccccgcg agaaccaggt 2280 

ggagcgcgac cgcgtgagca tccccgccat gcccacccag atccagtacc cccagtacca 2340 

gcccgtggag aacaagaccc agcccctggt ggtgtaccag taccgcctgc ccaccgagct 2400 

gcagtaccgc ccccccagcg aggtgcagta ccgcccccag gccgtgtgcc ccgtgcccaa 2460 

cagcaccgcc ccctaccagc agcccaccgc catggccagc aacagccccg ccacccagga 2520 

cgccgccctg tacccccagc cccccaccgt gcgcctgaac cccaccgcca gccgcagcgg 2580 

ccagggcggc gccctgcacg ccgtgatcga cgaggcccgc aagcagggcg acctggaggc 2640 

ctggcgcttc ctggtgatcc tgcagctggt gcaggccggc gaggagaccc aggtgggcgc 2700 

ccccgcccgc gccgagaccc gctgcgagcc cttcaccatg aagatgctga aggacatcaa 2760 

ggagggcgtg aagcagtacg gcagcaacag cccctacatc cgcaccctgc tggacagcat 2820 

cgcccacggc aaccgcctga ccccctacga ctgggagatc ctggccaaga gcagcctgag 2880 

cagcagccag tacctgcagt tcaagacctg gtggatcgac ggcgtgcagg agcaggtgcg 2940 

caagaaccag gccaccaagc ccaccgtgaa catcgacgcc gaccagctgc tgggcaccgg 3000 

ccccaactgg agcaccatca accagcagag cgtgatgcag aacgaggcca tcgagcaggt 3060 

gcgcgccatc tgcctgcgcg cctggggcaa gatccaggac cccggcaccg ccttccccat 3120 

caacagcatc cgccagggca gcaaggagcc ctaccccgac ttcgtggccc gcctgcagga 3180 

cgccgcccag aagagcatca ccgacgacaa cgcccgcaag gtgatcgtgg agctgatggc 3240 

ctacgagaac gccaaccccg agtgccagag cgccatcaag cccctgaagg gcaaggtgcc 3300 

cgccggcgtg gacgtgatca ccgagtacgt gaaggcctgc gacggcatcg gcggcgccat 3360 

gcacaaggcc atgctgatgg cccaggccat gcgcggcctg accctgggcg gccaggtgcg 3420 

caccttcggc aagaagtgct acaactgcgg ccagatcggc cacctgaagc gcagctgccc 3480 

cgtgctgaac aagcagaaca tcatcaacca ggccatcacc gccaagaaca agaagcccag 3540 

cggcctgtgc cccaagtgcg gcaagggcaa gcactgggcc aaccagtgcc acagcaagtt 3600 

cgacaaggac ggccagcccc tgagcggcaa ccgcaagcgc ggccagcccc aggcccccca 3660 

gcagaccggc gccttccccg tgcagctgtt cgtgccccag ggcttccagg gccagcagcc 3720 

cctgcagaag atcccccccc tgcagggcgt gagccagctg cagcagagca acagctgccc 3780 

cgccccccag caggccgccc cccaggctta agaattcaga ctcgagcaag tctagaaagc 3840 

catggatatc ggatccacta cgcgttagag ctcgctgatc agcctcgact gtgccttcta 3900 

gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca 3960 

ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg agtaggtgtc 4020 

attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg gaagacaata 4080 

gcaggggggt gggcgaagaa ctccagcatg agatccccgc gctggaggat catccagccg 4140 

gcgtcccgga aaacgattcc gaagcccaac ctttcataga aggcggcggt ggaatcgaaa 4200 

tctcgtgatg gcaggttggg cgtcgcttgg tcggtcattt cgaaccccag agtcccgctc 4260 

agaagaactc gtcaagaagg cgatagaagg cgatgcgctg cgaatcggga gcggcgatac 4320 

cgtaaagcac gaggaagcgg tcagcccatt cgccgccaag ctcttcagca atatcacggg 4380 

tagccaacgc tatgtcctga tagcggtccg ccacacccag ccggccacag tcgatgaatc 4440 

cagaaaagcg gccattttcc accatgatat tcggcaagca ggcatcgcca tgggtcacga 4500 

cgagatcctc gccgtcgggc atgcgcgcct tgagcctggc gaacagttcg gctggcgcga 4560 

gcccctgatg ctcttcgtcc agatcatcct gatcgacaag accggcttcc atccgagtac 4620 

gtgctcgctc gatgcgatgt ttcgcttggt ggtcgaatgg gcaggtagcc ggatcaagcg 4680 

tatgcagccg ccgcattgca tcagccatga tggatacttt ctcggcagga gcaaggtgag 4740 

atgacaggag atcctgcccc ggcacttcgc ccaatagcag ccagtccctt cccgcttcag 4800 

tgacaacgtc gagcacagct gcgcaaggaa cgcccgtcgt ggccagccac gatagccgcg 4860 

ctgcctcgtc ctgcagttca ttcagggcac cggacaggtc ggtcttgaca aaaagaaccg 4920 

ggcgcccctg cgctgacagc cggaacacgg cggcatcaga gcagccgatt gtctgttgtg 4980 

cccagtcata gccgaatagc ctctccaccc aagcggccgg agaacctgcg tgcaatccat 5040 

cttgttcaat catgcgaaac gatcctcatc ctgtctcttg atcagatctt gatcccctgc 5100 

gccatcagat ccttggcggc aagaaagcca tccagtttac tttgcagggc ttcccaacct 5160 

taccagaggg cgccccagct ggcaattccg gttcgcttgc tgtccataaa accgcccagt 5220 

ctagctatcg ccatgtaagc ccactgcaag ctacctgctt tctctttgcg cttgcgtttt 5280 

cccttgtcca gatagcccag tagctgacat tcatccgggg tcagcaccgt ttctgcggac 5340 

tggctttcta cgtgttccgc ttcctttagc agcccttgcg ccctgagtgc ttgcggcagc 5400 

gtgaagctaa ttcatggtta aatttttgtt aaatcagctc attttttaac caataggccg 5460 

aaatcggcaa aatcccttat aaatcaaaag aatagcccga gatagggttg agtgttgttc 5520 

cagtttggaa caagagtcca ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa 5580 

ccgtctatca gggcgatggc cggatcagct tatgcggtgt gaaataccgc acagatgcgt 5640 

aaggagaaaa taccgcatca ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 5700 

ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 5760 

agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 5820 

ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 5880 

caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 5940 
gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 
cctgtccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta 
tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 
gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 
cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 
tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 
tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 
aaaaaaagga tctcaagaag atcctttgat cttttctact gaacggtgat ccccaccgga 
attgcg 

<210> 81 
<211> 2103 
<212> DNA 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 81 

atgaacccaa gcgagatgca aagaaaagca cctccgcgga gacggagaca tcgcaatcga 
gcaccgttga ctcacaagat gaacaaaatg gtgacgtcag aagaacagat gaagttgcca 
tccaccaaga aggcagagcc gccaacttgg gcacaactaa agaagctgac gcagttagct 
acaaaatatc tagagaacac aaaggtgaca caaaccccag agagtatgct gcttgcagcc 
ttgatgattg tatcaatggt ggtaagtctc cctatgcctg caggagcagc tgcagctaac 
tatacctact gggcctatgt gcctttcccg cccttaattc gggcagtcac atggatggat 
aatcctacag aagtatatgt taatgatagt gtatgggtac ctggccccat agatgatcgc 
tgccctgcca aacctgagga agaagggatg atgataaata tttccattgg gtatcattat 
cctcctattt gcctagggag agcaccagga tgtttaatgc ctgcagtcca aaattggttg 
gtagaagtac ctactgtcag tcccatctgt agattcactt atcacatggt aagcgggatg 
tcactcaggc cacgggtaaa ttatttacaa gacttttctt atcaaagatc attaaaattt 
agacctaaag ggaaaccttg ccccaaggaa attcccaaag aatcaaaaaa tacagaagtt 
ttagtttggg aagaatgtgt ggccaatagt gcggtgatat tacaaaacaa tgaattcgga 
actattatag attgggcacc tcgaggtcaa ttctaccaca attgctcagg acaaactcag 
tcgtgtccaa gtgcacaagt gagtccagct gttgatagcg acttaacaga aagtttagac 
aaacataagc ataaaaaatt gcagtctttc tacccttggg aatggggaga aaaaggaatc 
tctaccccaa gaccaaaaat agtaagtcct gtttctggtc ctgaacatcc agaattatgg 
aggcttactg tggcttcaca ccacattaga atttggtctg gaaatcaaac tttagaaaca 
agagatcgta agccatttta tactattgac ctgaattcca gtctaacagt tcctttacaa 
agttgcgtaa agccccctta tatgctagtt gtaggaaata tagttattaa accagactcc 
cagactataa cctgtgaaaa ttgtagattg cttacttgca ttgattcaac ttttaattgg 
caacaccgta ttctgctggt gagagcaaga gagggcgtgt ggatccctgt gtccatggac 
cgaccgtggg aggcctcgcc atccgtccat attttgactg aagtattaaa aggtgtttta 
aatagatcca aaagattcat ttttacttta attgcagtga ttatgggatt aattgcagtc 
acagctacgg ctgctgtagc aggagttgca ttgcactctt ctgttcagtc agtaaacttt 
gttaatgatt ggcaaaaaaa ttctacaaga ttgtggaatt cacaatctag tattgatcaa 
aaattggcaa atcaaattaa tgatcttaga caaactgtca tttggatggg agacagactc 
atgagcttag aacatcgttt ccagttacaa tgtgactgga atacgtcaga tttttgtatt 
acaccccaaa tttataatga gtctgagcat cactgggaca tggttagacg ccatctacag 
ggaagagaag ataatctcac tttagacatt tccaaattaa aagaacaaat tttcgaagca 
tcaaaagccc atttaaattt ggtgccagga actgaggcaa ttgcaggagt tgctgatggc 
ctcgcaaatc ttaaccctgt cacttgggtt aagaccattg gaagtactac gattataaat 
ctcatattaa tccttgtgtg cctgttttgt ctgttgttag tctgcaggtg tacccaacag 
ctccgaagag acagcgacca tcgagaacgg gccatgatga cgatggcggt tttgtcgaaa 
agaaaagggg gaaatgtggg gaaaagcaag agagatcaga ttgttactgt gtctgtggcc 
taa 

<210> 82 
<211> 2103 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified env sequence 
<400> 82 

atgaacccca gcgagatgca gcgcaaggcc cccccccgcc gccgccgcca ccgcaaccgc 60 
gcccccctga cccacaagat gaacaagatg gtgaccagcg aggagcagat gaagctgccc 120 
agcaccaaga aggccgagcc ccccacctgg gcccagctga agaagctgac ccagctggcc 180 
accaagtacc tggagaacac caaggtgacc cagacccccg agagcatgct gctggccgcc 240 
ctgatgatcg tgagcatggt ggtgagcctg cccatgcccg ccggcgccgc cgccgccaac 300 
tacacctact gggcctacgt gcccttcccc cccctgatcc gcgccgtgac ctggatggac 360 
aaccccaccg aggtgtacgt gaacgacagc gtgtgggtgc ccggccccat cgacgaccgc 420 
tgccccgcca agcccgagga ggagggcatg atgatcaaca tcagcatcgg ctaccactac 480 
ccccccatct gcctgggccg cgcccccggc tgcctgatgc ccgccgtgca gaactggctg 540 
gtggaggtgc ccaccgtgag ccccatctgc cgcttcacct accacatggt gagcggcatg 600 
agcctgcgcc cccgcgtgaa ctacctgcag gacttcagct accagcgcag cctgaagttc 660 
cgccccaagg gcaagccctg ccccaaggag atccccaagg agagcaagaa caccgaggtg 720 
ctggtgtggg aggagtgcgt ggccaacagc gccgtgatcc tgcagaacaa cgagttcggc 780 
accatcatcg actgggcccc ccgcggccag ttctaccaca actgcagcgg ccagacccag 840 
agctgcccca gcgcccaggt gagccccgcc gtggacagcg acctgaccga gagcctggac 900 
aagcacaagc acaagaagct gcagagcttc tacccctggg agtggggcga gaagggcatc 960 
agcacccccc gccccaagat cgtgagcccc gtgagcggcc ccgagcaccc cgagctgtgg 
cgcctgaccg tggccagcca ccacatccgc atctggagcg gcaaccagac cctggagacc 
cgcgaccgca agcccttcta caccatcgac ctgaacagca gcctgaccgt gcccctgcag 
agctgcgtga agccccccta catgctggtg gtgggcaaca tcgtgatcaa gcccgacagc 
cagaccatca cctgcgagaa ctgccgcctg ctgacctgca tcgacagcac cttcaactgg 
cagcaccgca tcctgctggt gcgcgcccgc gagggcgtgt ggatccccgt gagcatggac 
cgcccctggg aggccagccc cagcgtgcac atcctgaccg aggtgctgaa gggcgtgctg 
aaccgcagca agcgcttcat cttcaccctg atcgccgtga tcatgggcct gatcgccgtg 
accgccaccg ccgccgtggc cggcgtggcc ctgcacagca gcgtgcagag cgtgaacttc 
gtgaacgact ggcagaagaa cagcacccgc ctgtggaaca gccagagcag catcgaccag 
aagctggcca accagatcaa cgacctgcgc cagaccgtga tctggatggg cgaccgcctg 
atgagcctgg agcaccgctt ccagctgcag tgcgactgga acaccagcga cttctgcatc 
accccccaga tctacaacga gagcgagcac cactgggaca tggtgcgccg ccacctgcag 
ggccgcgagg acaacctgac cctggacatc agcaagctga aggagcagat cttcgaggcc 
agcaaggccc acctgaacct ggtgcccggc accgaggcca tcgccggcgt ggccgacggc 
ctggccaacc tgaaccccgt gacctgggtg aagaccatcg gcagcaccac catcatcaac 
ctgatcctga tcctggtgtg cctgttctgc ctgctgctgg tgtgccgctg cacccagcag 
ctgcgccgcg acagcgacca ccgcgagcgc gccatgatga ccatggccgt gctgagcaag 
cgcaagggcg gcaacgtggg caagagcaag cgcgaccaga tcgtgaccgt gagcgtggcc 
taa 

<210> 83 
<211> 700 
<212> PRT 

<213> Human endogenous retrovirus, K family (HERV-K) 
<400> 83 

Met Asn Pro Ser Glu Met Gin Arg Lys Ala Pro Pro Arg Arg Arg Arg 
15 10 15 

His Arg Asn Arg Ala Pro Leu Thr His Lys Met Asn Lys Met Val Thr 

20 25 30 

ser Glu Glu Gin Met Lys Leu Pro Ser Thr Lys Lys Ala Glu Pro Pro 
35 40 45 

Thr Trp Ala Gin Leu Lys Lys Leu Thr Gin Leu Ala Thr Lys Tyr Leu 
50 55 60 

Glu Asn Thr Lys val Thr Gin Thr Pro Glu Ser Met Leu Leu Ala Ala 
65 70 75 80 

Leu Met lie val ser Met val val ser Leu Pro Met Pro Ala Gly Ala 
85 90 95 

Ala Ala Ala Asn Tyr Thr Tyr Trp Ala Tyr val Pro Phe Pro Pro Leu 
100 105 110 
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lie Arg Ala Val Thr Trp Met Asp Asn Pro Thr Glu val Tyr Val Asn 
115 120 125 

Asp Ser Val Trp val Pro Gly Pro lie Asp Asp Arg Cys Pro Ala Lys 
130 135 140 

Pro Glu Glu Glu Gly Met Met lie Asn lie ser He Gly Tyr His Tyr 
145 150 155 160 

Pro Pro lie Cys Leu Gly Arg Ala Pro Gly Cys Leu Met Pro Ala val 
165 170 175 

Gin Asn Trp Leu val Glu Val Pro Thr Val Ser Pro lie Cys Arg Phe 
180 185 190 

Thr Tyr His Met Val Ser Gly Met Ser Leu Arg Pro Arg Val Asn Tyr
195 200 205

Leu Gln Asp Phe Ser Tyr Gln Arg Ser Leu Lys Phe Arg Pro Lys Gly
210 215 220

Lys Pro Cys Pro Lys Glu Ile Pro Lys Glu Ser Lys Asn Thr Glu Val
225 230 235 240

Leu Val Trp Glu Glu Cys Val Ala Asn Ser Ala Val Ile Leu Gln Asn
245 250 255

Asn Glu Phe Gly Thr Ile Ile Asp Trp Ala Pro Arg Gly Gln Phe Tyr
260 265 270

His Asn Cys Ser Gly Gln Thr Gln Ser Cys Pro Ser Ala Gln Val Ser
275 280 285

Pro Ala Val Asp Ser Asp Leu Thr Glu Ser Leu Asp Lys His Lys His
290 295 300

Lys Lys Leu Gln Ser Phe Tyr Pro Trp Glu Trp Gly Glu Lys Gly Ile
305 310 315 320

Ser Thr Pro Arg Pro Lys Ile Val Ser Pro Val Ser Gly Pro Glu His
325 330 335

Pro Glu Leu Trp Arg Leu Thr Val Ala Ser His His Ile Arg Ile Trp
340 345 350

Ser Gly Asn Gln Thr Leu Glu Thr Arg Asp Arg Lys Pro Phe Tyr Thr
355 360 365

Ile Asp Leu Asn Ser Ser Leu Thr Val Pro Leu Gln Ser Cys Val Lys
370 375 380

Pro Pro Tyr Met Leu Val Val Gly Asn Ile Val Ile Lys Pro Asp Ser
385 390 395 400

Gln Thr Ile Thr Cys Glu Asn Cys Arg Leu Leu Thr Cys Ile Asp Ser
405 410 415

Thr Phe Asn Trp Gln His Arg Ile Leu Leu Val Arg Ala Arg Glu Gly
420 425 430

Val Trp Ile Pro Val Ser Met Asp Arg Pro Trp Glu Ala Ser Pro Ser
435 440 445

Val His Ile Leu Thr Glu Val Leu Lys Gly Val Leu Asn Arg Ser Lys
450 455 460

Arg Phe Ile Phe Thr Leu Ile Ala Val Ile Met Gly Leu Ile Ala Val
465 470 475 480

Thr Ala Thr Ala Ala Val Ala Gly Val Ala Leu His Ser Ser Val Gln
485 490 495

Ser Val Asn Phe Val Asn Asp Trp Gln Lys Asn Ser Thr Arg Leu Trp
500 505 510

Asn Ser Gln Ser Ser Ile Asp Gln Lys Leu Ala Asn Gln Ile Asn Asp
515 520 525

Leu Arg Gln Thr Val Ile Trp Met Gly Asp Arg Leu Met Ser Leu Glu
530 535 540

His Arg Phe Gln Leu Gln Cys Asp Trp Asn Thr Ser Asp Phe Cys Ile
545 550 555 560

Thr Pro Gln Ile Tyr Asn Glu Ser Glu His His Trp Asp Met Val Arg
565 570 575

Arg His Leu Gln Gly Arg Glu Asp Asn Leu Thr Leu Asp Ile Ser Lys
580 585 590

Leu Lys Glu Gln Ile Phe Glu Ala Ser Lys Ala His Leu Asn Leu Val
595 600 605

Pro Gly Thr Glu Ala Ile Ala Gly Val Ala Asp Gly Leu Ala Asn Leu
610 615 620

Asn Pro Val Thr Trp Val Lys Thr Ile Gly Ser Thr Thr Ile Ile Asn
625 630 635 640

Leu Ile Leu Ile Leu Val Cys Leu Phe Cys Leu Leu Leu Val Cys Arg
645 650 655

Cys Thr Gln Gln Leu Arg Arg Asp Ser Asp His Arg Glu Arg Ala Met
660 665 670

Met Thr Met Ala Val Leu Ser Lys Arg Lys Gly Gly Asn Val Gly Lys
675 680 685

Ser Lys Arg Asp Gln Ile Val Thr Val Ser Val Ala
690 695 700